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1. Introduction. The numerical solution of boundary value problems for partial 
differential equations usually requires the solution of large systems of linear equations. 
The order, n, of such systems is essentially equal to the number of mesh points in the 
domain under consideration. Since direct inversion procedures require the order of n° 
operations they are not practicable, even using high speed digital computers, for reason- 
able meshes in two or more dimensions. Thus iterative methods for solving linear systems 
are of great interest as they usually require the order of n® operations. In addition the 
coefficient matrix of the system which results from the finite difference approximations 
has many strategically placed zeroes. However, no special account of these zeroes is 
taken in most direct inversions or in general iterative procedures. It is reasonable to 
expect that particular methods, designed in accordance with the general structure of 
the coefficient matrix, could further reduce the number of operations. Many such special 
iteration schemes have been devised and conditions on the coefficient matrix which are 
sufficient to insure the convergence of some of these methods have been obtained [1, 7, 9]. 
However there is no general comparison procedure to determine which of many possible 
methods is ‘“‘best”’ in a given case. 

In the present paper we formulate a family of iterative schemes for a particular 
class of coefficient matrices (in which the zeroes are placed as in the usual five-point 
Laplace difference equations). This family is defined by a generalization of the usual 
notion of extrapolation or over-relaxation. It is then possible to formulate the problem 
of finding the ‘“‘best”’ scheme and, more important, some general theorems on the eigen- 
values of these schemes are proved. 

The theorems are used to define three subclasses, called complete image classes, of 
the general family. These classes contain many of the schemes in current use as well as 
generalizations of them. Thus it is shown that a variety of independently proposed and 
seemingly unrelated iterative methods are special cases of a general class of methods. 
These complete image classes are such that each of the eigenvalues of any scheme in 
the class is a given function of one of the eigenvalues of a particular reference scheme 
of the class. Thus a knowledge of the eigenvalues of the reference scheme permits, in 
principle, the determination of the best scheme of the given class. 

A special class of equations is considered for which all the eigenvalues of each reference 
scheme can be explicitly written in terms of the eigenvalues of two matrices. It is then 
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possible to determine which of the reference schemes is best and hence it is also possible 
to determine that scheme which is best of all those in the complete image classes. 

Asan example of the application of this general theory the Laplace difference equations 
are considered. The well-known results on rate of convergence for the Richardson, Lieb- 
mann and extrapolated Liebmann methods are immediate consequences. Analogous re- 
sults are obtained for the less well-known “‘line”’ methods* which are shown to be superior. 

It is clear that the methods of the present paper can be applied to more general 
iterative schemes than those considered. In addition many of the present results can be 
easily extended to problems in higher dimensions and with more general boundaries. 

2. Formulation. <A large class of two dimensional linear elliptic difference equations 
are of the form: 

bi5 — Lisbi-n.§ — Tisdiar.s — Obs 5-1 — bisa = 84; 5 (2.0) 

within a coordinate rectangle specified by 1 < i < p,1 < j < gq. On the boundaries of 
the domain equations of the form (2.0) hold with the coefficients: 


LL; =7,; = 0, [355 ¢; ba = t, = 0, 1<t<p. (2.1) 


Such equations are obtained from second order? elliptic partial differential equations by 
applying the usual second order difference approximations on some coordinate mesh, 
(é; , n;). The coefficient matrix of the resulting system, (2.0) and (2.1) must be non- 
singular and, with a little care in differencing [1], can be made positive definite (and 
symmetric if the equation is self adjoint). However, we shall assume, unless otherwise 
stated, only the non-singularity. 

For convenience of notation and discussion we introduce, for each j, the p-dimensional 


column vectors: 


6=/%]  g = |*}. (2.2) 
and the p X p order matrices: Dy; L 8, 
o 0 (Or y ae 
i.; 0 O Tr, 
l 0 
L;= | 0 R; = , ts 358: CD 
| 0 
0 | eo 
0 L, 0 0 0 


*The origin of iterative line methods is obscure. They have been in use by Russian mathematicians 
for a number of years. About 1945 J. von Neuman and L. H. Thomas independently proposed line 
methods for parabolic difference equations. An independent investigation was initiated by M. E. Rose 
and the present author in 1953 and some of the results of Sec. 8 were then obtained. Peaceman and 
Ratchford have studied their applicability to parabolic difference equations and double-sweep iterative 
solutions of the Laplace difference equations. 

tThe five point scheme implied by (2.0) assumes no mixed partial derivates in the equations. 
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B, = | os ; l<jsq T= fa l<j<q. 
Lo by, , tJ 
The system (2.0-.1) can now be written as 
(J — L, — R,)®, — T,% = 8, ; 
-B,%;-, + UT — L; — KR); — 7;%4, = 8; , 2<j5¢q-1; (2.4) 
—B,?,,+( —L, —R,)&, = S 


a? 
Here we have introduced the identity matrix, 7, which is always assumed to be of the 
same order as the square matrices to which it is added. 


further simplification is obtained by introducing the (p X q)-dimensional column 
vectors* (or g-dimensional compound vectors): 





?, $1 | |S, | $11 
| 
?, P2\ . S, | So, | 
© = . Pe Smi-*| =|}; e 
. . | (2.5) 
| | 
?, Prva t S, Spa 
and the [(pq) X (pq)]-order matrices (or [q X q]-order compound matrices): 
L, 0 | R, 0 | 
L, R, 
I , R= i is 
0 I 0 R, 
(0) 0 Re 0 
B, 0 0 7 
| 
B, 0 | 
B 1 |. (2.6) 
| 
0 7 
0 B, 0 0 0) 
The system ol linear equations (2.0-.1), or (2.4) now becomes 
Mo ¢-L—-R-B-T)@= S. (2.7) 


Similar formulations may be introduced for more general boundaries and in higher 


dimensions. In particular, if the boundaries are composed of coordinate segments, the 


*The vector @ of (2.5) determines an “‘ordering”’ [1] of the unknowns, ;;; but this order need bear 


no relationship to the sequence in which the iterative computations are carried out. 
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matrices L; and R; remain square but of different orders while the matrices B,; and 7; 
become rectangular. Another pair of “‘neighbors’’, ¢;;,4: , appear in (2.0) with each unit 
increase of the dimension and, correspondingly, additional pairs of matrices must be 
introduced in a manner similar to those of (2.6). 

3. General single sweep iterations. We summarize here some terminology and 
known results for a class of iterative methods for solving (2.7); this class is defined as 
follows: Let the coefficient matrix, M, be written as 


M=N -P, (3.0) 
where | VN | + 0. We call this a “splitting” of the coefficient matrix and the system 
(2.7) becomes 

N@ = Pé+ S, (3.1) 

with the formal solution 
@ = (N — P)'S = (I —N'P)'N'S. (3.2) 
The iterative procedure is defined, starting from some arbitrary guess, &“’, at the solution 


vector, by the recursion 
Ne” = Pe” + §. (3.3) 


Thus in one sweep through the mesh a new iterate is obtained. Applying (8.3) recur- 
sively yields for the vth iterate 


oe” = (I+ (NP) + (NP)? + --- + (NP) "NUS + (N'P)’®™. (3.4) 
Thus [2] 6°” — @ as v > ©, for arbitrary °°’, if and only if 


Lim (N7'P)’ = 0. (3.5) 


yo 
. . . . . . . . © ry-1 
This condition is satisfied provided some appropriate norm [3] of (N~ P), say the spectral 


norm, is <1. 
This result is more frequently obtained by introducing the sequence of error vectors 


E” =6-— 96", (3.6) 
which, by (3.1) and (3.3), must satisfy the homogeneous recursion 
NE” = PE’, y>1. (3.7) 
The eigenvalues, A, , of N~'P are the (pq) roots of the characteristic equation 
|AN — P| =0. (3.8) 
If they are distinct* there then exists [4] a complete set of eigenvectors, e, , satisfying 
Ne, = Pez , (3.9) 
which span the (pqg)-dimensional vector space. Thus any initial error, Z“’’, has a unique 
expansion in these eigenvectors of the form 


EB” => Gm. (3.10) 
k=1 


*It is sufficient here to assume that all the elementary divisors [4] of (N~'P) are simple. 
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By (3.7) and (3.9) the above yields, for the vth error vector, 


Pa 
E = > Nae « (3.11) 
k=1 
In order that &'”’ — ® it is necessary and sufficient that E'’’ — 0. Thus by (3.11) and 


the completeness of the eigenvectors, convergence is equivalent, for arbitrary initial 
* 
error’, to 


Amex = max |r, | < 1. (3.12) 
k 


Let us require that the most slowly decaying component in the vth error, (3.11), 
be reduced by at least 10°”, where m > 0. Then we must have X*,,, < 10°” and the number 
of iterations required is bounded by 


vy > —m/log Amx = m/R. (3.13) 


This result is valid only when (3.12) is satisfied and then R = —(log Amax) is called 
the rate of convergence. This quantity is useful in comparing different iterative methods 
as the number of iterations required for some specified convergence criterion varies as 
R. ' 

If the elementary divisors of N~'P are not simple, condition (3.12) still suffices for 
convergence but the bound (3.13) does not apply. To obtain such a bound we assume 
the elementary divisor of largest order corresponding to Aya. = | A, | Say, is of order 
r + 1. Then the expansion (3.10) must be replaced by [4] 


Pa 


r+1 r+1—-k 
= 81°F Orrenha S nee a.19 
k=r+2 


k=1 2=0 < 
provided vy > (r + 1) (and assuming all other divisors to be simple). The largest decay 
factor is now given by, for vy > (r + 1)Amax + 7, 


> Vv 
Recs ( ); 
Tr 


and to reduce the error by at least 10°” requires 


_— (") ew. 
r 


An approximate bound on the number of iterations which may be obtained from this 


inequality, is 


vy > m'/R +r 4 (r/2R) log [(m’/R + r)m’/R], (3.15) 


where m’ = m — log r!. The number of iterations required for convergence is now not 
simply proportional to R~’ but this result is asymptotically (for m — ©) equivalent to 
(3.13). For practical computations the discrepancy may be significant. 

A large rate of convergence should not be the only criterion in evaluating iterative 
methods. Rather a measure of the time required (by human and/or machine effort) to 
achieve the desired accuracy should be employed. This time is essentially proportional 


*For a special initial error in which the amplitude of some particular component vanishes, say 
a; = 0, the corresponding eigenvalue, \,, is unrestricted. However, even if such initial distributions could 
be determined, computations with roundoff would undoubtedly introduce this component at an early 
stage of the iterations and it could become arbitrarily large if (3.12) is violated. 
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to the total number of arithmetic operations (property weighted). Thus if we let N,, 
be the operational count required for each iteration [i.e. to solve (3.3)], the total time is 


proportional to [assuming (3.13) to be valid], 
T =N,,/R. (3.16) 


The “best’’ of the single sweep methods is that for which 7’ is minimized. Of course in 
more general procedures a similar criterion should be employed. 
4. Special single sweep iterations and generalized extrapolation. With the above 


notions in mind we consider the special class of splittings 
NO vol — vib — 7.R — y3:B — yT, P(y) = N(y) — M, (4.0) 
where the real numbers y, are restricted by the conditions that 
cl) N(y) ~ U, b)  Viv2¥s¥4 Q. (4.1) 
This defines a subset [ of the five dimensional Euclidian space of all points y. The 
iterations are defined by 
N(y)®” P(y)®”" ' +S. (4.2) 


Thus any point y e I determines an iterative scheme given by (4.0) and (4.2); we shall 
sometimes refer to this as the scheme re The class of schemes, IT, is important since it 
includes many known and frequently used schemes while the data arrangement. required 
for all of these schemes is well suited for automatic computing machines. Furthermore, 
the solution of the system (4.2) is obtained explicitly when y e« T in one sweep over the 
mesh by solving either two-term or three-term recursions. In particular if either yoy,y. # 0 
Or Yovx7; * O three-term recursions are introduced along horizontal or vertical mesh 
lines. The solution of these recursions may be reduced, by the well-known factorization 
of a Jacobi matrix, to the evaluation of two, two-term recursions along the appropriate 
lines. If two or more of the y; + 0 the method of sweeping the mesh to solve (4.2) in 
one sweep is partially determined (i.e. the sweep must start ata particular corner or 
with some line next to a boundary). 

The operational counts, V,,(y), required to solve (4.2) for any of the schemes y ¢ T 
vary at most by a factor less than two. The minimum number of operations needed is 
four multiplications and five additions at each mesh point and the maximum required 
is seven multiplications and six additions (neglecting the operations done only once in 
factoring the matrix of Jacobi form). The special subclasses of I introduced in Sec. 6 
require at most six multiplications and five additions at each point. Thus in the remainder 
of the paper we shall assume the operational counts to be almost equal and in seeking 
the ‘‘best’’ scheme we shall consider only the eigenvalues. 

The eigenvalues of a scheme such as (4.0) to (4.2) are the roots \ of the characteristic 


equation 


or explicitly, of the equation 
A gol — gb — gR — g,B — gsT 0, (4.3) 


where 
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Each root, \,(y), is thus a function of the scheme, y, and a problem naturally suggested 
is to find a y* e T such that 
Nmax(¥*) = min (max | A,(y) |). (4.4) 
vel h 
The solution of this problem would yield, essentially, the best iterative method of the 
class considered. The usual notion of extrapolation (or over-relaxation) of the scheme 
y* is meaningless, by virtue of (4.4), since no improvement on rate of convergence can 
be obtained. In fact the usual extrapolation techniques may be viewed as attempts to 
find the solution of (4.4) when y is further restricted to lie on some particular curve in T. 
This is in fact shown to be the case in Sec. 6 and the curve, in the usual extrapolation 
procedures, is a straight line. The problem posed in (4.4) is thus a generalization of 
extrapolation in which the values of four extrapolation parameters are to be obtained 
[since, by (4.1b), T consists of four four-dimensional subspaces]. 
In order to obtain some idea of the possible behavior of \(v) we consider schemes y 


on the “diagonal’’ line 
Yo = = % = %3 = 1 = I/e. (4.5) 


These points do not lie in T and, obviously, taking a = 1 implies direct inversion of MM. 
However, to gain insight, we use (4.5) in (4.3) and obtain the characteristic equation: 


Ee —1)+ 1 aw = @. 
a 


Since, by assumption, | 1/ | # 0, there is only one eigenvalue (with multiplicity p 
. . eS . 


and it is 


Thus along the diagonal line (4.5) all schemes converge for 0 < a < 2 (Le. y, > 1,2) 


and diverge otherwise (i.e. y, < 1 2). Only one iteration is required for a = 1, as then 
\ = 0. It might be suspected that the best scheme in T is obtained by taking a point 
earest this unit point [a = 1 in (4.5)]. That this is not the case is a simple consequence 
of the results of Sec. 6. The eigenvalues of the above example become arbitrarily large 


n absolute value as a — + © and the point y approaches the origin; the roots approach 
unity as a— +0 and the point y recedes to infinity. It should also be observed that these 


eigenvalues are independent of the matrix J/. 

5. Some properties of the eigenvalues, \(vy)._ We present here some theorems which 
ean be used to compare the eigenvalues of various schemes y. All of these results are 
simple consequences of the 

Fundamental theorem: Let L, R, B and T be arbitrary matrices of the form (2.6), 


(2.3). Then for any non-zero scalars x and y. 


_— 


M ee es ee es ee es = (5.0) 


Proof. Let the elements of M be M,, where 1 < r, s < N = pq. Then each term in 
the formal expansion of | M/ | is given by [5] 
+M, eM, mw (2) MM, ctr) +> My, N) 9 (5.1) 


where 7 is one of the N! permutations of the first N integers. Let each point (7, j) of 
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the original rectangular mesh (see Fig. 1) be identified with a unique integer r = 
(j — 1)p + 7, and represent any permutation by the N vectors from r to r(r) on this 
mesh. By the definitions (2.6), (2.3) and (5.0), M,,.:-) # 0 only if r(r) = r or the point 
corresponding to z(r) is an adjacent neighbor of the point r. Thus the only permutations 
which lead to non-vanishing terms (5.1) are those whose geometric representation is 
composed entirely of unit vectors in the (+2)- and (+))-directions and null vectors. 
However, any permutation is a product of disjoint cycles [5] and the representation of 
a cycle is a closed path on the mesh. Thus for non-vanishing cycles there are the same 
number of unit vectors in the (+7)-direction as in the (—7)-direction and similarly 
for the (-+j)-directions (see Fig. 1). Since W,,,,,) is an element of ZL if r(r) = r — 1, 
and an element of R if r(r) = r + 1, there are as many factors from L as from R in 
each non-vanishing cycle. A similar result is true of factors from B and 7. Thus each 
non-vanishing term in the expansion of the right hand determinant in (5.0) is independent 


of x and y and the proof is complete. 
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Fia,j'l. Geometric representation of a non-vanishing cycle. In the permutation of, n — a(n), which this 
cycle is a factor the cycle is given by: 
mr) =r+p, ur+p)=r+2p,..... . mr—l)=r. 

The geometric form of the above proof was suggested by D. Ludwig. Another proof 
has been given by B. Friedman [8]. However, the geometric proof clearly indicates 
that the same result can be proved for a matrix which represents the difference equations 
on any connected region bounded by coordinate segments. In higher dimensions the 
analogous theorem, obtained by adding two more matrices to M for each additional 
dimension, is e isily proved. A special case of the above theorem, namely that obtained 
by setting x = y in (5.0) has been proven by D. Young [1]. Many of the remaining 
results in this paper hold for any matrix WV which can be written in the form (2.7) such 
that (5.0) is satisfied for arbitrary x and y 

The first, almost obvious, consequence of the fundamental theorem is contained in 

Theorem I. Let (vy) = Ayo, ¥1 5 Yo » Y2 » Ys) De any eigenvalue of the scheme y. Then 
for all y: 

Ny) = AlYo » Vo 5 V1» Va» Vs) = AlYo 5 V1» V2 » Ya » Ya)- (5.2) 
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Proof. The eigenvalues \(y) are the roots of the characteristic equation (4.3). How- 
ever, by the fundamental theorem we have 
| | 
1 1 
A= | gol — xg,L — P- g2R — yg,B — : gsT f 
and taking « = (g2/g,)'”, y = (g:/gs)'”* the above yields* 
A = | gol — (gig.)'” (L + R) — (gsgs)'? (B + T) |. (5.3) 


Thus the roots of A = 0 are invariant under the interchanges y, <> y, and 7; © ¥, . 

In terms of iterative procedures of the class (4.2), Theorem I indicates that various 
ways of sweeping the mesh yield identical rates of convergence. In terms of the subspace 
I the theorem states that there are certain two-dimensional planes with respect to 
which the eigenvalues are symmetric. Thus the volume of I which must be searched 
for a best scheme, y*, is reduced by a factor of 1/4. Furthermore it is of interest to note 
that along the two-dimensional planes y, = y, and y; = 7, the eigenvalues A(y) must 
have relative maxima or minima with respect to the appropriate pair of coordinates. 

A somewhat more general result which contains Theorem I as a special case is con- 
tained in 

Theorem II. Let \'(y’) be some particular eigenvalue of the scheme y’. For any 
scheme y let A(y) be a function such that 


(2) = 0% _ Ih 20, (5.4) 
Jo Jig 939 

where g; and g; are respectively the functions of (\’, y’) and (A, y) defined in (4.3). 
Then A(y) is an eigenvalue of the scheme y. 

Proof. Form the determinant A of (4.3). Then as in the proof of Theorem I we obtain 

(5.3). Using (5.4) this becomes 

A = #* | giI — (gigi) 7(L + R) — (9393) 7(B + T) | = #4’, 
by another application of the fundamental theorem to the scheme y’. However, since 
\’(y’) is an eigenvalue of y’ we have A’ = 0. Thus A = 0 and X(y) must be an eigenvalue 
of ¥. 

The schemes y and y’, and the eigenvalues A(y) and \‘(y’) of the theorem will be 
called images of each other. We sometimes refer to y’ or \’(y’) as the reference scheme 
or reference eigenvalue respectively. It is clear from (5.4) that the relationship of being 
images is a transitive one. Some consequences of Theorem II are examined in the remain- 
ing sections. 

Another theorem which is quite useful for some special difference equations is 

Theorem ITI. Let the matrices (2.6) be such that (L + R)(B+ T) = (B+ T)(L+4+ R). 
Then as is well known these matrices have common eigenvectors; let p, and u, be the 
eigenvalues of (L + R) and (B + 7) respectively, corresponding to the common eigen- 
vector e, . Then each eigenvalue of any scheme 7 is a root, for some k, of the (at most 
4th degree) equation** 

he (9:92) "”" px = (939s) "ue = 0. (5.5) 

*The chosen branches of the square roots are arbitrary and hence all future applications of (5.3) 


could be written in any of four ways. 
**As pointed out in the proof of Theorem I there are four such equations which could be used. This. 
point is clarified in Sec. 7 where more of the consequences of the hypothesis are examined. 
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Proof. Let \(y) be some fixed but as yet unspecified eigenvalue of the scheme y. 

Then from (4.3), A = 0, and as in the proof of Theorem I, we have from (5.3) 
Jol — (gige)'*(L + R) — (gsgs)'*(B + T) | = 0. (5.6) 


Thus the matrix in (5.6) is singular and has a zero eigenvalue. Applying this matrix to 


€, we see that 


Jo — (Gig2) “pr — (gsgs)' “hy 


Str 


is an eigenvalue belonging to the eigenvector e, . Using all the e, we obtain all of the 
eigenvalues of the matrix in (5.6). However at least one & = 0. Since A(y) was any 
eigenvalue of the scheme y the theorem follows. 

This theorem is applied generally in Sec. 7 and to the usual Laplace difference equa- 
tions in Sec. 8. 

6. The complete image classes. ‘The pairs of image schemes defined by Theorem II 
are such that only one eigenvalue of any y need be the image of one eigenvalue of the 
reference scheme, y’. However, of special interest are those schemes each of whose 
eigenvalues is the image of a corresponding eigenvalue of some particular reference 
scheme. Such classes of schemes are called complete image classes and we proceed to 
obtain three of them. The intersections of these classes with the subspace [ is grouped 
into five subspaces: T, , T's, Tg, [ce and l'¢. , of which [',, and I. are considered 
uninteresting. 

The class T, . In order that any two schemes y and y’ be images, (5.4) must be 
satisfied by the corresponding pair of image eigenvalues. However, let us seek first 
schemes such that 

oe = (6.0) 
Jige2 J39s 


is an identity in the indeterminates \ and X’. This requires 


a) V1V¥2 = YVs¥1 b) | Vi¥2 = 375 
4 sil 4 ‘2 (6.1) 
B F — 3 a a , , 
LY1 T V2 = ¥3 + Ys » iyi +¥2 = ¥3 +7 - 


Now let \’ be any eigenvalue of a scheme y’ which satisfies (6.1b). Then if y is any 


scheme satisfying (6.la) every root d of 


Jo\- i Je : 
(22) - Hd (6.2) 


, , 
J Jig 
is an eigenvalue of y by Theorem II. Since (6.0) is satisfied for any \ and X’, each equation 
(6.2) obtained for a different eigenvalue \’ of y’ determines at least one eigenvalue ) of 
the scheme y. Thus all schemes y satisfying (6.la) belong to the same complete image 
class, say class A. Since the conditions (6.la) and (6.1b) are identical any y ¢ A may be 
used as the particular reference scheme. We call the chosen reference scheme y, and 
take it to be 
Ya :¥o = 1, 1 = 2 = ¥3 = Vs = O. 

This is a simultaneous displacement method commonly known as Richardson’s method. 
Thus each of the eigenvalues of any class A scheme is the image of one of the Richardson 


eigenvalues. 
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The class A schemes of interest are those contained in [. From (4.1) and (6.1) we 
obtain the set. of all such schemes, I’, , and they are listed in Table A. The real parameters 
a and £6 are to be restricted such that (4.1a) is satisfied. The set I’, lies on four two- 
dimensional planes in I’, one of which is, say: y2 = ¥; = 0,7, = ¥3 . By the symmetry 














1 /a 1/8 0 1/8 0 
I, _ — a ita - ian _ ee | 
1 /a 0 1/p | 0 1/6 | 

l/a 1/8 0 0 1/8 

YA 1 0 0 0 0 


TABLE A 


property of the eigenvalues, expressed in Theorem I, we need examine only one of these 
four planes. However using any y e I'y and y, in (6.2) yields 


ow C8 ae 7 AZ — (1 — 8)]. (6.3) 


This equation furnishes the mapping of the eigenvalues \, of y, (Richardson) onto the 
eigenvalues \ of any of the schemes in T, . 

If we choose a = 8 = linany y eT’, the resulting scheme is a successive displace- 
ment method commonly known as a Liebmann scheme, in the present connection, or 
more generally as the Gauss-Seidel method (see Table I). The mapping (6.3) yields 


A=Ai, or A=0. (6.4) 


Thus some \,4 go into zero and others into their squares. These schemes converge, as is 
well known [7], if the Richardson scheme converges, and the rate of convergence is 
twice as large (see Sec. 3). 

If we set 8 = lin any ye I, the resulting scheme is a successive overrelaxation [1] 
method where a is the overrelaxation parameter (see Table I); this method is frequently 
called the extrapolated Liebmann scheme [6]. The schemes y, in this case, lie along four 
lines in I’, , for example: y. = y, = 0, 71 = ¥3 = 1. The mapping (6.3) becomes 


[\ — 1+ a]’ = a’AiX, (6.5) 


which has been studied thoroughly [1]. 

Taking a = 6 in any y e I, yields a different successive overrelaxation method; 
this corresponds to using successive displacements (Liebmann) over the entire mesh 
and then extrapolating (or interpolating) the provisional new iterate with the old iterate. 
The mapping (6.3) becomes, in this case, 


A= 1-—a(1 — AJ, (6.6) 
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which is easily analyzed. If the \,4 are all real and Anax,4 < 1, then Ams, is minimized by 
taking 

= [1 — 3(Mnax.a + Ania.) ; 
where Amin,a = min, | A,(y4) |. This method is the analogue of the case treated at the 
end of Sec. 4; it is called a full-mesh extrapolation. 

If all of the eigenvalues A, are real and Amax,4 < 1 the general mapping (6.3) can be 
analyzed. It is found, in this case, that the best of the class A schemes is extrapolated 
Liebmann. However, in other cases, it seems possible to improve upon the Liebmann 
extrapolations. The analysis of this mapping has been done by K. Gordis and will be 
reported in a future paper. 

The class T', . Proceeding as in the previous case we now require 


(2) — 9:92 
/ — er 
Jo 9192 
to be an identity in \ and X’. This is equivalent to 


a) Y=n=%, b) w=n=%- (6.7) 
Using (4.1b) these relations yield the indicated classes T', and Ig. of Table B. 



































| | 
Yo 71 | v2 Ys | 
| | Ife l/a | 1/a 1/8 0 | 
Ig | 

| l/e 1/a l/a 0 | 1/8 
\—————=-| 
| Te’ 0 0 | 0 1/a ua 1/6 | 
ve | 1 | 1 | 2 | o | 0 oe 

| 





TABLE B 


The canonical reference scheme, yz, , included in the table is a simultaneous line-dis- 
placement method; by its analogy with y, we shall call it a line-Richardson method. 
The classes of schemes I’, and Ig. lie on three two-dimensional planes in T’. As before 
if we take \, to be any eigenvalue of y, then by Theorem II any root \ of 


GG _ (2) 

9394 Jo 
is an eigenvalue of y e ', (or I',-) of the table. Of course the appropriate y e I’, (or T',-) 
is to be used in the g,; . Using y e I'y and y, of the table this yields 


r AZ — (1 — 8)); (6.8) 
the same mapping (6.3) as in the [', schemes. However, the reference schemes y, and y, 
are not images of each other and thus we cannot, in general compare their eigenvalues. In 
Secs. 7 and 8 special cases are considered in which \, and \, may be compared; it is 
then clear that some schemes in I’, are better than the best scheme in T, . 

All of the simplifications (6.4-.6) are shown to apply for y e I's by choosing the 


[A-— (lL -—a)} = 
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same special a and 6 as were chosen for y ¢ I'4 (see Table I). Thus we may call y = 
(1, 1, 1, 1, 0) “‘line-Liebmann”’, and y = (1/a, 1/a, 1/a, 1, 0) “extrapolated line-Lieb- 
mann” and their eigenvalues have exactly the same relationships to the line-Richardson 
eigenvalues as the eigenvalues of ordinary Liebmann and extrapolated Liebmann have 
to those of ordinary Richardson. 

The remaining class of schemes, I’, , yield the mapping 

[A — (1 — a)][A — (1 — B)] = aBrdg . (6.9) 
This result has been examined and is in general found to be inferior to that of (6.8). 

The class T¢ . This class is determined exactly as the previous one by interchanging 
(¥:1 , Y2) with (v3 , ys). The mappings (6.8) and (6.9) are obtained for the corresponding 
classes I, and Iz, . The canonical reference scheme is y¢ = (1, 0,0, 1, 1), a line-Richard- 
son method, which is not an image of y,4 or yz . In the special cases of the next two sections 
it is possible to compare \, and X¢ and thus to determine which of the classes, ', or 
I. contains the best scheme. 

The essential difference between I’, and I’, schemes is the direction of the line along 
which three-term recursions must be solved. In all I’, schemes they are parallel to the 
direction of increasing 7 and in Ig parallel to the direction in which 7 increases. 

7. Reference eigenvalues for a special class of equations. We consider here those 
systems of the form (2.2) to (2.7) for which (L + R) and (B + 7) commute. Then the 
fundamental theorem and Theorem III are valid. However, before applying these 
results we shall examine some other very important consequences of commutativity for 
matrices of the indicated forms. 

By using the forms (2.6) in 

(L+ R(B+T) = (B4+ T)(L+ R) 
we obtain the conditions 


(L +R )B, ss B,(L, it inal, 2 j = q; 
(L, + R)T;, = TAL, + B41); l<j 


lA 


lA 
— 
| 


These conditions are necessary and sufficient for commutativity. With no loss in gen- 
erality* we may assume | B; | # 0 and | 7; | # 0 in which case the above conditions 
become 

BAL,_, + R,.,)B;° , 25554; 

4 

PAbis + 8,077 , l<j<q-1. 
Since the (L; + #;) are all similar they have the same eigenvalues and elementary 
divisors; and as the order of these matrices is p X p there are at most p distinct eigen- 
values p; , 1 < 7 < p. From the form (2.6) of (L + R) we see that its eigenvalues are 
just the p; , each of whose multiplicity has a factor q [i.e. if p; has multiplicity r, in 
(L; + R;) then it has multiplicity r,q in (L + R)]. If the p; are all distinct (or belong to 
simple elementary divisors) it can be shown from (7.0) that for some scalars a 


ee he l<j<q-l. (7.1) 


1 


(L; + R;) (7.0) 





*Only the boundary conditions may introduce singular B; and 7’, for elliptic difference equations. 
However, these cases may be eliminated, depending on the boundary conditions, by either not including 
the boundary values as unknowns (as in Sec. 8) or by requiring the difference analogue of the equations 


to hold at boundary points. 
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By changing the ordering (2.2), (2.5) from the original row-ordering to a column- 
ordering (which is equivalent to a similarity transformation of L, R, B and T) we find, 
as above, that the eigenvalues yu; of (B + T) are the eigenvalues of p similar g X q 
matrices. As eigenvalues of (B + 7) each u; has a multiplicity which is a multiple of 
p and there are at most q such eigenvalues. It is well known that commuting matrices 
have common eigenvectors. Thus in the above case there exist common eigenvectors 
e,; such that 


(L 4 Roye;; = pCi; ; (B + Te; ; = pei; 5 (7.2) 


for each p; and yu; belonging to different elementary divisors of (L + R) and (B + T) 
respectively. ; 

To determine some properties of the p; and u,; we consider the characteristic equations 
of which they are the roots. Exactly as in the proof of the fundamental theorem we see 


that 
| | 


| pl — (L; +R) | =| of — ah, -R, | 
for arbitrary x ~ 0. Thus taking x = —1 we find that if p is an eigenvalue of (L; + R,) 
then so is —p. Furthermore if p is odd there is at least one zero eigenvalue and additional 
zero eigenvalues must occur in pairs. Analogous results* are true of the eigenvalues 
uw of (B + T). 
We are now in a position to apply Theorem III. We select first y4 = (1, 0, 0, 0, 0) 
and using (7.2) we may replace the pair (p, , 4.) of the theorem by the pair (; , u;) to 


obtain from (5.5 


Nina = Pi tH. (7.3A) 
Similarly using yz = (1, 1, 1, 0, 0) we get 
hen ete , (7.3B) 
1 — p; 


and, finally, with yc = (1, 0, 0, 1, 1) Theorem III yields 


die = (7.3C) 
l — p; 
We have introduced an obvious double subscript notation for the \. These are all of 
the roots of the corresponding schemes since (7.3) holds for all p; and uw; . Using other 
schemes y in (5.5) would yield many other eigenvalues explicitly. However, any of the 
class Ty , 'y or I'¢ scheme eigenvalues are obtained by using (7.3) in (6.2) ete. 

As a further condition let us assume that the p; and yu; are all real. (A sufficient 
condition for this would be that (L + FR) and (B + T) are symmetric. Then B;., = 7; 
and by (7.1) the 7; and B; must be constants times the identity. Similar results would 
hold for the transformed (1 + R) block matrices.) Then by the parity of the eigenvalues 
p; and yu; (i.e. since max, | p; | = max, p;) we obtain from (7.3A) 


) | = Pmax + Mmax + (7.4A) 


*The properties of the eigenvectors (7.2) and the sign parity of the eigenvalues explains the ap- 
parent ambiguity pointed out in Theorem III, in which any of four seemingly different equations could 
have been used. They are not different but correspond to different labeling of the eigenvalues. 
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Furthermore, if pas, < 1 and ums < 1 we get from (7.3B, C): 


hee = ee (7.4B) 
— Pmax 

Amax.C coe 7a _Pmex (7.4C) 
i= Mmax 


Thus we may easily compare these principal eigenvalues to obtain 
a . 5 


as 


a) Pmax + Umax S 1 => , en < nee ’ Ee )s (7 5) 
/ ] ‘ | Y - 

b) | Pmax ~~ 1/2 j $ | Mmax ] 2 | = ———e - ae . 
Since the mappings of the eigenvalues of the schemes in T', , ', and I, from the cor- 
responding image scheme eigenvalues are the same, the best scheme will be found in 
that class for which X,,,. is smallest. The relations (7.5) may thus be used to determine 
the best class. It should be noted, from (7.5a), that in the present case if the Richardson 
method converges (Amax.4 < 1) then the line-Richardson methods converge faster, and 
if Richardson does not converge neither do the line-Richardson methods. 


8. Laplace’s equation. As an example we consider A*o = (0 in the rectangle 
0<2x2<1,0< y < 1; with ¢ = (given function) on the boundary. Using the mesh 
, l ; 
%;, = 1Az, Ar=-, On ¢<. p: 
r (8.0) 
, ] : 
y; = jAy, Ay.==-, 0s 4< 4; 
q 
and the difference approximations 0°¢/dx* = 1/A2° (¢;41,; — 26;; + ¢,-;.;), ete., at 


(x; , y;), we obtain the difference equations 


¢:; = 9.[¢; + dis1.4] + O,[bs.5-1 + Os, 541]; 4 ; (8.1) 


lA 
lA 
| 


Here 4; , .; , ;.9 and ¢;., are the given boundary values and 
= P,i i.@ dg 


Ay” - Ax” 
6. = 5, LAD? ears, Ghee s (8.2) 
2(Ar + Ay’) (Ax + Ay ¥ 
Equations (8.1) may be written in the notation of Sec. 2 where 1;; = r,; = 6 , b); = 
t;; = 6, and the inhomogeneous terms come from the boundary values; the system is of 


order [(p — 1)(q — 1)]’. The matrices (L + R) and (B + T) then commute and are 
symmetric, and as proved in Sec. 7, their eigenvalues are the same as those of respectively, 


c > 











0 1 0 0 1 0 
: @ = i g 

6. — and #6, 
LO 1 0) 0 1 OJ 





(—- 1) X@-—}) q@-)x@-D 
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The eigenvalues of these matrices are easily found to be 


p; = 20, cos (ix/p), l<i<p-1; 


(8.3) 
u; = 20, cos(jr/g), LSjsgq-1. 
The parity properties deduced in Sec. 7 are seen to be satisfied as p; = —p,-; and 
uy; = —u,-; . Thus the principal eigenvalues are obtained fori = j7 = 1, and we have 
Pinax = 20, cos r/p & 20, — O(x/p)’, (8.4) 


Minax = 20, cos r/q & 20, — 0,(x/q)’, 
where the approximate values hold for p > m and q > z. 

Let us consider first iterative solutions of (8.1) by means of the canonical reference 
schemes of Sec. 6 (all of which are simultaneous displacement methods). The principal 
eigenvalues of these schemes are obtained, since Theorem III applies, by using (8.4) 
in (7.3); then, as above, retaining only terms up to second order in 1/p and 1/q we have 


Naess CI — (2 + 4), [Richardson] 
Pp q 


i 


——— 1 — . (% + *), [Horizontal line-Richardson] (8.5) 
20.\p 


Amx.c V1 — cone (2 + os), [Vertical line-Richardson]. 
20,\p 
In this approximation we see that the direction of sweep, for minimum \,,,, of the above 
line-methods, is determined only by the mesh ratio and does not depend upon the number 
of mesh points in the z or y-directions (since both p and q were assumed large). If the 
number of points is ‘‘small’”’ the analogous criterion is obtained by using the exact 
eigenvalues (8.4) in (7.5). The rates of convergence (Sec. 3) of the above schemes are 
easily compared by recalling that —log (1 — e«) & e for small e. If the mesh is square 
6, = 6, = 1/4 and the line methods converge twice as fast as the ordinary Richardson 
(to second order in 1/p). If the mesh is rectangular then 0, < 1/4 or 6, < 1/4 and by 
sweeping in the proper direction further improvement is obtained (the implicit equations 
should come from lines parallel to the direction of largest mesh spacing). 
The successive displacement methods corresponding to the above reference schemes 
may be taken as (see Table I) 
vy. = (1,1,0,1,9), vy, = (1,1, 1, 1,90), y. = (1, 1, 0, 1, 1). (8.6) 


The eigenvalues of these schemes are related to the corresponding reference eigenvalues 
(since they are complete image schemes) by (6.3) with a = 8 = 1. We note as in Sec. 6 
that \ = 0 may become an eigenvalue of the schemes (8.6) independently of the values 
of \’. The remaining roots become \ = (A’)*, and corresponding to (8.5) we get, up to 
second order in 1/p and 1/q, 


Nmez.e V1 — an'( + 4), {Liebmann] 
q 


p 
. x (0. , 0, , ” 
Amex. 1 — — 1 + SI, [Horizontal line-Liebmann] (8.7) 
6.\p = q 
Lease Vl] 8 (9 + 9y , [Vertical line-Liebmann]. 
6, \p q 
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Of course the rates of convergence are now rigorously twice those of the reference scheme 
rates. The best sweep direction is determined as in the previous case, and again the best 
line method converges at least twice as fast as the ordinary Liebmann (to second order 
in 1/p and 1/9). 

The successive overrelaxation schemes which are usually applied to the above are 
Yo = (1/a,1,0,1,0), y, = (1/a, 1/a,1/a,1,0), y. = (1/a, 1,0, 1/a, 1/a), (8.8) 
and now (6.3) holds in each case with 8 = 1. As is well known [1] for y, above (which is 
extrapolated Liebmann) and hence for all these cases, X,,,, Will be a minimum when a 


is chosen as the smaller root of 


Nuax.xa —4a+4=0, (8.9A) 
where X = A, B or C. 
[That is, solve (6.3) for \ and set the discriminant equal to zero when Ay = Amax.a -] 
Then we have all |X| = Ajax Where 
Amz = a — il. (8.9B) 


Using the reference eigenvalues (8.5) in the above we obtain for the schemes (8.8), up 


to second order in 1/p and 1/g, 


‘6. Li f 0. ; 
Niece = 1 = 2n( < ee ss) + an & + a), [Extrapolated Liebmann] 

pg pq 

wit 7) ae (4 ) [Extrapolated horizontal |, 
m1 =~ (9/6) S ay ed hs (8.10 
Amax.s ("3 +r q + 6. \p" - q line-Liebmann] ) 
a eee . £ [Extrapolated vertical 

Amax ” p + qr ? 6, \p” ¥ q line-Liebmann]. 


There is an order of magnitude improvement in the rates of convergence of the extrap- 
olated schemes over the previous schemes. That is, since (0,/p" + 6,/q°) = Ax’ Ay*(Ax* + 
Ay’)~*, the present rates are O(Ar) or O(Ay) while from (8.5) and (8.7) the rates are 
6(Ax*) or 6(Ay’). The line schemes now improve the convergence rate by a factor 2’ 
for a square mesh or larger for rectangular meshes swept in the proper direction. 
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ON THE SOLUTION OF CERTAIN DIFFERENTIAL EQUATIONS 
BY CHARACTERISTIC FUNCTION EXPANSIONS* 


BY 
TSE-SUN CHOW 
Research Staff, General Motors Corporation, Warren, Michigan 
1. In this article we seek for the solution of the differential equation 
vu Ou Ou 
Po agi + Pt og + Pt = ay si 


where po , Pp: , P2 are functions of x, (x real) with the initial condition 


u(x, t) = u(t), t= 0, (1.2) 

and the following boundary conditions 
aula, t) + au(b, t) + asu,(a, t) + au,(b, t) = f(t), (1.3) 
B,u(a, t) + Bulb, t) + Bufa, t) + Bub, t) = g(d), (1.4) 


where a, , 8, are constants’, and u, = du/dx. Equation (1.1) is a differential equation of 
the second order of the parabolic type. Special cases of problems of this kind occur in 
heat conduction and diffusion, usually with simpler types of boundary conditions. 
Since the boundary conditions (1.3) and (1.4) are non-homogeneous and time-dependent, 
the method of separation of variables cannot be used directly. We shall first use a trans- 
formation” to remove the non-homogeneous boundary conditions and then separate 
the variables. This results in the well known Sturm-Liouville system and the solution 
of (1.1) --- (1.4) will be sought as expansions of the characteristic functions of this 
system. With arbitrary values of a; , 8; the resulting Sturm-Liouville system is in general 
not self-adjoint, and the characteristic functions are not orthogonal. Yet, it is known 
in the theory of differential equations that if we introduce the adjoint system, the 
characteristic functions of the two systems will be bi-orthogonal, i.e., the characteristic 
function of one system for one particular characteristic number will be orthogonal to 
all the characteristic functions of the other system with the exception of one of the same 
characteristic number. In carrying out the expansion procedure to find the solution of 
(1.1) --+ (1.4) we shall make use of this bi-orthogonality relationship and shall show 
in the final solution how the time-dependent funetions, f(t) and g(é) are related to the 
boundary forms complementary to those of the adjoint system. These are given by 
(4.3), (4.4) and (4.5 


*Received Feb. 1, 1957; revised manuscript received June 10, 1957. 
The author wishes to thank Professor R. C. F. Bartels of the University of Michigan for comments. 
He is also indebted to the referee for pointing out a somewhat misleading statement in the manuscript. 
1We assume that ow, o18 , 2 , 038 # O, Where oy. = a8. — arf; ete. 
24 homogeneous differential equation with non-homogeneous boundary conditions is equivalent 
to a non-homogeneous differential equation with homogeneous boundary conditions [1]. 

The theory of non-self-adjoint boundary value problems and the associated expansions of functions 
in terms of bi-orthogonal systems of characteristic functions does not appear nearly as well known as 
the theory of self-adjoint systems. Readers are referred to Coddington and Levinsen [2] for information 
on this subject. 
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2. To remove the non-homogeneous boundary conditions we make the substitution 


u(x, t) = g(r, ) + >) X.T; 


t=1 


in Eqs. (1.1) --- (1.4) where X, and 7; are functions of x and ¢ respectively. We get 


a*¢ ag _ a _ Siem 
Po ox? + D; ax + Pot ry i _ {X ,T% — T L(X,)}, (2.1) 
t(@, 0) = w(z) — > X,7,0), (2.2) 


a, f(a, t) + apf(b, t) + azf.(a, t) + a,f.(0, t) 


4 ¥ farX(@T(t) + X(T) + asXKaT() + auXKH)T)} = f), 


B, f(a, t) + B.f(b, t) + Bsf.(a, t) + B.f.(b, b) 


+ >> {8,.X(a)T(t) + BXADT A(t) + BXKIT AD + BXUOT A} = g(t). (2.4) 


t=] 
In these equations L = p, d’/dx”® + p, d/dx + p., {x = d¢/dz and all primes, the corre- 
sponding derivatives. We next choose X, such that 


(wv — a — 7? a rr oat 
=1, X,(b) = Xi(@) = X1(b) = 0, (2.5) 
X,(6) = 1, X(a) = X2(a) = X3(b) = 0. 

and furthermore 
= f oo ” ‘ / > 
T(t) = (Baf() — a2g()}/or2 , (2.6) 
T(t) = {a,g(t) — Bi f(t)} /ore . 
Then the boundary conditions to be satisfied by ¢(z, t) are 
ee t) + af(b, t) + a;f.(a, t) + asf.(b, t) = 0, (2 7) 
B, f(a, t) + Bog(b, t) + Bsf.(a, t) + B.g.(b, t) = 0. 


In the meantime a particular choice of X,(x) can be immediately determined by (2.5). 
Thus 








_(¢-b* ,@— d(e-<a 
i= Ete - 92. (2.8) 
\3 — 2 _ 
X,(2) = (¢—a) .(¢—a(z—b) (2.9) 


~ (b—a)’® * (b — a)° 
3. With X,; , 7; determined, the right-hand side of (2.1) is known completely, and 
we are led to consider the following ordinary differential equation associated with (2.1), 


LAWn) = Don’ + Piva + (D2 + An)¥n = 0, (3.1) 
with the boundary conditions 
[U, {vy} — a, Y,(a) + asW,(b) + as ;(a) + a4 7(b) _ 0, (3 2) 


Bivn(@) + Bovn(b) + Bsvn(a) + Bayn(b) = 0, 


Il 


LUs{ va} 
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where X,, is the characteristic number. Let V,, , Y,.2 be two fundamental solutions of 
(3.1), then the characteristic numbers \, are the roots of the following determinant: 


Uit¥a} Ust¥rad | 9, 3.3) 
|U2{Wa} Ust¥n2} | 


Now consider the following system which is adjoint to the original system defined 
by (3.1), (3.2): 


L*(Xn) = L*(xn) + AwXn = (Poxn)’” — (Pixn)’ + 2 + An)xn = O, (3.4) 
[Vs (xm} = Vixnl@) + Yaxa(b) + exé(a) + vaxi(d) = 0, 3.5) 
LValxn} = 5:xn(a) + S2xn(b) + Ssxa(a) + Sixn(d) = 0, 
where y, , -** , Ys , 6; , *** , 5, are constants. The characteristic functions of the two 


systems are y,(x) and x,(x). It is known in the theory of differential equations [2, 3] 
that w,(x), x,(x) are orthogonal in the interval (a, b). We further write f° ¥,x,dx = C, . 
These properties will be utilized in obtaining the solution of the system (2.1) --- (2.4) 
in terms of expansions of y,(z). 

We now expand the right-hand side of (2.1) into a series of y,(x). Let 


Aw, = T: . a,:Wn(2), (3.6) 
T.L(X,) = T. 2 baivala), (3.7) 


where 
b 
a; = [ Xx, d2z/C, ete.; 


assuming further that 


(x, t) = > F(t) ¥,(2), 3.8) 

and substituting (3.6), (3.7), (3.8) into (2.1), and collecting coefficients of y,(x) we have 
—r,F7,() — Fad = > {a,;,7% — b,,:T;}, (3.9) 

where use has been made of L(y,) = — A.W, , by (3.1). Upon integration of (3.9) we have 


immediately 
F.() = F,(0) exp {—A,¢} 


t 2 9 
- [ 1D a, T(r) — z=. bd} exp {—A,(t — 7)} dr. (3.10) 
d = roar 


0 


The coefficients F’,(0) are to be determined by the initial condition (2.2); thus 


> F,0) y(t) = u(x) — DS XAT O), 


t=1 
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and 


FO) = | {usta Pie £2) 70 bx) dx/C, . (3.11) 


Integrating [5 Re a,;Ti(r) exp {—X,(t — 7)} dr in (3.10) by parts and remembering 
that 
¢(z, i) = >> F,()v,(z) and u(x, t) = g(a, t) + ) ee ae 
n=1 i=] 
(1.4) in the following form: 


we obtain the formal solution of the given system (1.1) 


2 at 


u(z, t) = Ps ae exp {—A,?} | Uo(x)x (x) dx 
_ ° ie (3.12) 


. 


+ 2. 2 Le) / x(x)L,(X,) dx i T (7) exp {—),(t — 1)} dr, 
where X; and 7; are given by (2.8), (2.9) and (2.6). 

4. As written in (3.12) the solution contains the functions X; which are rather 
arbitrary: they have only to satisfy (2.5). These functions have already been determined; 
however, it is possible to eliminate them in the final solution. To this end we make use 
of the Green’s formula relating the two systems (3.1), (3.2) and (3.4), (3.5) 


b 


| 'y,L,(X,) — X,L*(x,)} dx 


= Dol b)xn(b).X {(b) — Dol b)xh(b)X ;(b) = pi(b)x,(b) X ,(b) +- p,(b)x,(b)X ,(b) (4.1) 


— p,(a)x,(a)X/(a) + pola)xi(a)X (a) + pi(a)x,(a)X (a) — p,(a)x,(a)X,(a) 


= U,{Xi}Valxn} + U2{Xi} Valxn} + Ust{Xi} Votan} + Us{ Xi} Vilxat, 


where U, {X,}, U, {X;} are linear combinations of X,(a), X;(b), X/(a), X/(b) and 
,,) are linear combinations of x, (a), x,(b), x{(a), x(b). U, , U2 and V,, V2 


Vs {xn}, Va tx 
have been defined by (3.2) and (3.5) respectively. Noting that 
L¥(xn) = 0, Vitxn} = Veixa} = 9, 
and remembering X,(x) has to satisfy (2.5), we have 
[ xoL.(X.) de = 6.{poa)xi(a) + pila)x,(a) — pr(a)x,(a)} (4.2) 
Jia 2 
+ §,.{—po(b)x{(b) — pi(b)x,(b) + pi(b)x,(6) } a:Valxn} + Bi Vsixn}, 
where 6;; , 6,2 are the Kronecker deltas. Substituting the expressions for 7’; as given 


by (2.6) and the result (4.2) just obtained into (3.12) we obtain the solution in the 


following form: 


aw (4) de 
u(x, t) p> = exp {- A,,.¢} | Uo X)X AL dx 


L(r)V, ) al 
tt ae ed | f(r) exp {-A,(t — 7)} dr (4.3) 
= W(t) Vatxnt f 
— 2 v, ar x | g(r) exp {—A,(t — 7)} dr, 


70 
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where 
: l 
Veix.} 7 — tae, [(po(b) — py(d))x,.(b) + pol(b)xi(b) ] (4.4) 
+ a.[(pi(a) — p,(a))x,(a) + po(a)xz(a)]}, 
and 
V lx} { 8, [(pi(b) (b))x.(b) + polb)xi(b)] 
47 = = wernae dl) ) == ) ) n ) Io ) n ) - 
Xn§ - 1 Pi x I Xn\Y, (4.5) 


+ B,|(pi(a) — p,(a))x,(a) + po(a)xi(a)]}. 
5. As an example of the previous discussions, consider the diffusion equation for the 


axi-symmetric case, a <r < b: 


a2 4) , 
3 +3 P ‘ ’ ating 
with the initial condition ; 
ulr, t) = u(d), t= @, (5.2) 
and the boundary conditions 
aula, t) + au(b, t) + asu,(a, 1) + a(b, t) = f(d, (5.3) 


B,u(a, t) + Bu(b, t) + Byu,(a, t) + Byu,(b, t) = g(t). 


By following the same procedure as outlined in the previous paragraphs we are led 


to consider the system: 


Litt} = (s ms oe r,s = 0, (5.4) 
dr rdr 
Oi }v,3 a, W,(a) + acy,(b) + asli(a) + aWi(b) = 0, (5.5) 


Uofyv,} = Bwv,(a) + Bov,(b) + Ba vi(a) + BWi(b) = 0 


i ld l — 
L*Sx,} (s s-->+a.+ 1), = 0, (5.6) 
dr r dr r 
Vitxnd = Vixnl@) + Yoxnlb) + vaxrla) + vsxn(b) = 0, (5.7) 
o.4 
Volx,} = dx,(a) + bx,(b) + daxl(a) + 8yx2(b) = 0, 
where y, , -*: , 6, are to be determined. Now the fundamental solutions of (5.4) are 
J (UPr), Yo(Al?r), being Bessel functions of the first and the second kind of the zero 
order. The characteristic function of the system (5.4), (5.5) is therefore 
¥.(r) = J(M?r) — EY,(Air), (5.8) 


where F is a constant and is determined by U, {y,} = U,. {y,} = 0. Similarly the funda- 
mental solutions of (5.6) are rJo(\’r) and rY,(A,r), and the characteristic function of 
5.7 


the system (5.6), (5 


) is 


x7) = rd (Ax7r) — E’rY (Ar), (5.9) 
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E’ to be determined by V; {x,} = V2 {x,} = 0. The Green’s formula connecting the two 
systems is: 


b 
i] baLn(Vn) — Val*(x,)} ar 


, 1 
= w’(b)x,(b) — wv d = 
eCB)xs(B) — YalD)x4(0) + F Val B)xa(8) — 


— yila)xa(a) + vala)xé(a) — = vala)xa(a) 


= Ui{ dn} Valxn} + Usldn} Valxn} + UslYn} Vatxn} + Ustdn} Vilxnt- 


Here U, {¥,}, U2 {vn} have already been defined, as by (5.5); if we take* 


U3{y,} = == {a a y,(a) + cava) ; (5.11) 


013924 O24 


U,{y¥,} = - = {6 Tus y,(a) + avo} ; (5.12) 


013024 O24 
we can find V; {x,}, --: , Vs {xn} by comparing the coefficients of y,(a), ¥.(b), ¥i(@), 
y/(b) in (5.10). This results” 
ant 1 a ad 1 a : P 
Vilxn} = aeos§|— — — )xn(a) — x(a)? + asois)\— — — )xnld) — xn(d)?, (5.13) 
a a b OQ, 


\ 3 


Vilxa} = Boor (* _ Bs) (a) — xn(a)¢ + aur (} _ B2),(b) _ xo} , (5.14) 


2 8 
ees Yr , — 
Vafxn} = — f{asx,(a) + asx,(5)}, (5.15) 
34 
P_% i I f f 4 ’ fae y 
} 41Xnj = i 1 Bsx,(a) + Bsx,(5)}. (5.16) 
34 
Now we introduce 
2,, = J(N7a) V(b) — J (M7) YA), (5.17) 
(Pla) = —a,M%, + agr12,, , 
Qa) = —a,2%, “ie 
nd a) A:;Q2o0 + a3An Qio , (5.18) 
\Rla) = a2 Qoo — asd Qui , 
| 
LS(a) = 219 — arn Qi ; 


( K(a) = ad (Axa) — asda J, (A, ), 
Lia) = Qod o(Xn *b) — asd, pf Ft te “b), 


(5.19) 
M(a) = a; ran “a) — asA., ai Fe *a), 
\N(a) = a> Y (A, *b) — asd, oh ff *b), 
4Of course, other forms of U3{w,} and Us{Wn} may be chosen as long as Ui {yn}, — — — , Uslya} 


form an independent set in the quantities y,(a), ¥.(b), ¥4(a), ¥A(5). 
5 Equations (5.15) and (5.16) are equivalent to (4.4) and (4.5) if we put in the latter po = 1, pi(a) = 


1/a, ete., and make use of Vifxn} = Velxa} = 0. 
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and similar expressions for P(8), --- , N(@). Also let ¢ = bo,;/ac., , then, using the fact 
that Ji(MNV?r) = — 0? T(r), ViAL?r) = — d,¥1(A”r), where the prime denotes 
the derivative with respect to r and that J,(7r)¥,(N?r) — Y,QAl?r) Jo(?r) = 
2/mrd.’r we obtain, after some algebraic details, 





























+ SOT - MEG om 
WO) = Srey F Na) ~ MOV ENG) — 
r- Rote. a scue. om 
lw efi tial eam, a 
xs) = OTE aN)” MB) oN) 6.28 


Furthermore, since y,,(r) and x,(r) are orthogonal in the interval (a, b), we have 
b 
[ Yun()xr) dr =0, men, 
and 
b 
[ ¥Alr)Xn(r) dr 


b? {P(a) + 2a, ‘tr? b} {P(a) + 2a.0/mn *b} + {Qla) + 2a,/rb} {Q(a) + 2a,o/rb} 








2 {M(a) + N(a)}{M(a) + oN(a)} 
_ a’ {R(a) + 2a;/ra} {oR(a) + 2a;/ra} + {S(a) + 2a, lta) fo S(a) + Qa, /mri*a} 
2 {M(a) + N(a)}{M(@) + oN(a)} 
= C,. (5.26) 
In.the last expression we can replace all the P(a), --- , N(a) by the corresponding 
P(8), --- , N(8). The characteristic numbers 4, are the roots of the equation 
1/2 ~ 713 oh) 
Ox4Qi1A, — (044201 + 732210)Xn + 912200 + w\a + Fe 0. (5.27) 


With these preliminaries the solution of the given problem can be written down according 
to (4.3). We note in particular that the system (5.4), (5.5) becomes self-adjoint if ac., = 
bo;; , i.e., if ¢ = 1, in which case E = E’ and x,(r) = ry, (r). 

6. We now consider a special case of the previous example. Let 


jo = —I, YS > 0, a = a, a= 0, (6.1) 


la, = 0, Bo=1, B=0, &=B, 
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then the boundary conditions are reduced to 


—u(a, t) + au,(a, ) = fd, (6.2) 


u(b, t) + Bu,(b, t) = gb. 


This problem arises in the evaluation of transient temperature distribution in a homo- 
geneous hollow circular cylinder, a < r < b, when the gas temperatures inside and out- 
side the cylinder are functions of time, being f(¢) and g(¢) respectively, and the initial 
temperature distribution in the cylinder is u(r). The constants a and £@ are associated 
with the heat transfer coefficient and the thermal conductivity of the cylinder material 
(here a, 8 > 0). Then ac., = bo,; = 0, and the system is self-adjoint. Here we have 


JM? a) + ann? Si (Ma) T(r *b) — BM "Jin *b) 


RB = E’ —} = > i > ~ i > —— a > 79 ~ > ° (6.3) 
Yo(An"a) + adn Y,(A,"a) Y,(\,"b) — BAY (4b) 
' l 1/2 see cet hs ; 
v,.(r) = —x,(r) = Jo(A, 1) — EY .(A, 7) (6.4) 
- 
(a) 1 (a) 2a/ra Qo — BArw’? Qo: (6.5) 
= -yx,(\@) = —- Tn — aE 3 ee Jiu 3 oO 
. a * Y,(\:7a) + od! Y,(0a) ~~ -Y0(A2b) — BA? Yb) 
o. alas 28 /xt 
y,,(b) == bi) = es ie > : ij27 ij2 => > “T72 B 95 ~ i72 (6.6) 
b Yo(A, a) + ar,’ Y,(A,’ a) YA. 8) — Bo YL DD 
ab ab 
C”", = | v,(r)x,(r) dr = | ry.(r) dr 
; : ) (6.7) 
a @ 1+ 6. _ 7 l+ar, | mn 
wr, |{Yo(r1/7b) — Baw’ Y,(A,’"b)}° (V¥(04a) + adi’ Y, (02) }*) ’ 
and X, is a root of 
ABQ2:s:An — (aQip — BQ)? — Qo = O. (6.8) 


Furthermore, from (5.15) and (5.16) we have 
Vaixat = xv lb)/B, 
Vialxn} = —xn(a)/a, 


so that the solution of this special case can be written down, according to (4.3), 


nb 


u(r, t) = ys ve exp (—A,?) | ru(r)y,(r) ar 


va 


«= = Yair) ¥nKa) | f(r) exp {—A,(t — 7)} dr (6.9) 


CO t 
Q n=1 n /0 


b — v,(ry,(b) f° 
+->> vr Ym ” [ g(r) exp {—A,(t — 7)} dr. 
B n=1 ( n 70 

The solution of this special case for f(t) and g(t) equal to constants has already been 
given by Carslaw and Jaeger [4], using the method of the Laplace transform. It is easy 


to verify that the result presented here is the same as that given in [4]. For, if g(r) = 
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>2., Aw,(r), a <r < b, then 


if 1 , 1 
A.==> re(r)y,(r) dr = —>— rote) ve! + — vi )ar 
( [ Cae / r (6.10) 


ne 


b 
= at {. ob) ¥a(0) + = g(a)¥n(a) + i re" (Vile) ir}, 
where use has been made of — y,(a) + af{(a) = ¥,.(b) + 6Y4(b) = 0. By putting g(r) = 1 
and log r successively in (6.10) we can find the summation of the series }>2% 
v,(r) ¥,(a)/C,r, and >°®., ¥,(r) ¥,(b)/C,A, . Replacing f(t) and g(t) by the constants 
f and g respectively in (6.9) and using the results just obtained after performing the 
integration, we get 


ab 


- ( _ 
u(r, t) = ps ver exp (—A,?) | ruo(r)v,(r) dr 


n=1 


_ afb log b/r + B} + bgia log a/r — a} (6.11) 


ab log b/a + aB + ba 


+ 5 HO Led yay — 12 yc} exp (—a0, 


which is the form given by Carslaw and Jaeger. In this expression y,(r), ¥,(a@) and 
v,(b) are given by (6.4), (6.5) and (6.6) respectively. 





n=1 
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BOOK REVIEWS 


Théorie des Circuits de Telecommunication. By Vitold Belevitch. Librairie Universitaire 
Louvain, 1957. viii 384 pp. $9.00. 


This is a well conceived exposition of electric circuit theory for the use of engineers having the usual 
mathematical preparation. The contents of the book are indicated by the tiiles of the fourteen chapters, 
which can be translated freely as follows: I, Linear systems; II, Analysis of Kirchoff networks; III, 
Energetics of passive networks; IV, Reflection and transmission; V, Scattering matrix; VI, Image param- 
eters; VII, Analytic theory of passive networks; VIII, Image theory of low-pass filters; IX, Bandpass 
filters; X, Filters with pre-determined attenuations; XI, Inductances and transformers; XII, Ampli- 
fiers; XIII, Transient phenomena; XIV, Complements on the synthesis of passive networks. 

Thus the book is devoted chiefly to the steady state theory of circuits consisting of discrete elements. 
It appears that the author’s knowledge of this subject is very extensive, and that he has made many 
significant contributions to it; and, so far as the reviewer can judge, he expounds the subject with great 
competence and thoroughness. On the other hand, the parts of the book (notably Chapter XIII) which 
deal with other branches of circuit theory seem to be too sketchy to serve as more than rudimentary 
introductions. 

As has been indicated above, the mathematical preparation expected of the reader is quite modest. 
Not only does the mathematics used not go appreciably beyond that usually taught in engineering 
curricula, but also in many places intuitive physical arguments are used, instead of rigorous mathe- 
matical demonstrations, in the derivation of formulae. On the logical level which the author has chosen, 
the exposition is generally satisfactory. However, there are occasional slips which may give unwary 
readers trouble. Thus the author states the Hurwitz stability condition (i.e. the condition that the 
real parts of the roots of an algebraic equation be negative) in a special simplified form which applies 
only to the case in which the leading coefficient in the equation is positive, but he does not mention this 
restrictive assumption concerning the coefficients. Again, in the discussion of Fourier integrals, the 
author states that the absolute integrability of f(t) over the interval (— ©, ©) is a necessary condition 
for the representability of f(¢) by a Fourier integral. 

There is an extensive list of errata. However, this is concerned mainly with trivial typographical 
errors, and does not seem to touch on any of the more serious faults. 

On the whole, the book affords a comprehensive and lucid exposition of a large and important 
part of curcuit theory, and it should be welcome to engineers who are concerned with the practical de- 
sign and use of circuits. An excellent feature of the book is a fifteen-page critical bibliography. This 
will afford adequate guidance to readers who wish to study the subject more deeply and extensively. 


L. A. MacCouu 


Wahrscheinlichkeitstheorie. By Hans Richter. Springer-Verlag, Berlin, Gdéttingen, 
Heidelberg, 1956. xi 435 pp. $16.65. 


The author states in the preface that his objective is to supply a remedy for the lack of any modern 
German text on probability theory. He seems to have accomplished not only this modest goal but also 
to have made a valuable contribution to the international literature. 

The reader is assumed to have some knowledge of analysis and linear algebra plus some appreciation 
of pure mathematics. Starting at this relatively low level, the author devotes about half the book to a 
rigorous development of the experimental and mathematical foundations including the necessary back- 
ground from the theory of measure and integration. The last half of the book includes discussions of 
random variables (expectations, etc.), properties of common types of distributions (Poisson, Gaussian, 
etc.) and finally some of the well-known limit theorems (laws of large numbers, zero-one laws, central 
limit theorems, etc.). 

Although the book covers only the above limited range of topics, it treats these very thoroughly. 
There is little if any discussion of applications or more advanced topics such as Markov chains or sto- 
chastic processes, 

G. F, NEWELL 
(Continued on p. 258) 
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TOROIDAL WAVE FUNCTIONS* 
BY 
V. H. WESTON 


University of Toronto 


Abstract. The Helmholtz equation is solved in toroidal coordinates. A complete 
set of solutions is obtained representing radiations from a ring source. 

Introduction. Up to now the Helmholtz equation has been solved only for separable 
coordinate systems. This paper presents solutions for a non-separable coordinate system, 
the toroidal. The main interest will be with continuous, single-valued solutions satisfying 
the radiation condition and possessing a ring singularity. Exact expressions for each of 
the wave functions will be obtained in the form of a series expansion and integral over 
finite range. The series expansions are uniformly convergent everywhere in space except, 
of course, at the ring singularity. 

The time dependence will be of the form exp (—iwf). 

1. Series solution of the Helmholtz equation in toroidal coordinates. The relation 
between toroidal and cartesian coordinates systems is given by [5, p. 151] 


d sinh ¢ cos @ 
cosh — cos 7’ 


_ dsinh ésin ¢ 
cosh — cos 7’ (1.1) 


dsin 7 ; 
cosh § — cos 7 
Domains of the coordinates are 0 < » < 27,0<@< 27,0 < = <fo where — = &) 


defines a torus 





z’ + (p — d cothé,)? = d’ esch’ & 
and 7 = m defines a sphere 


(zg — d cot m)? + p’ = ad’ ese 


where p = (x° + y’)' 
The metric coefficients are given by the following relations 
A, =h, = 
£ ” ? 
cosh — — cos 7 
(1.2) 
d sinh é 
hk, = ——— 
cosh § — cos 


From now on, in order to facilitate analysis, the variable s will be used instead of & 


where the two are related by the equation cosh ¢ = s. 
Now it has been shown [7], that for a certain class of non-separable rotational coordinates 


(u; , U2 , o), there are solutions of 
Vy+ky =0 (1.3) 
*Received May 1, 1957; revised manuscript received July 2, 1957, 
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given by 
(uu; ,Ue,¢) =e” rw a,(u;)[hs(u, , W.2)]" (1.4) 


e'**B(u;) 2, b,(u;)[hg(u, , U2) 1" 


V(r , U2 , ) 


where 7 + 3, andi # j + 3, provided that the metric coefficients h,; and h, , and the 
function B(u;) satisfy certain conditions. The coefficients a,(u,) and b,(u,) of the above 
series satisfy a recurrence set of ordinary differential equations. 

In particular, the toroidal coordinates (&, 7, ¢) belong to this rotational class and 
the solutions are such that the power series given by (1.4) have a lower termination. 
If the power series are expressed in the variable (s — cos n)~* instead of h, the differential 
equations involving the coefficients of the power series take a simpler form. The function 
B(n) for toroidal coordinates is sin y. Solutions of (1.3) are given by 


y.(s, n, @) =e” oll 3: A,(s) (s — cos yn)” (1.5) 


7 


and 
Yo(s, 7, ¢) = e'” sin be B.(s) (s — cos yn)". (1.6) 
r=7 


The coefficients must satisfy the differential equations 


(s? — 1)Al’ + 228A? — 4 ror + 1) + ~f. c|+e a ‘ 1.7) 
g§= (1d 
— (2r — I[(e — 1)As_, — s¢ — DA,-,] = 0 
(s?> — 1)B’’ + 2sB' — B E —-1)+s5 f. | + k° d’B,_, 
s — ] (1.8) 
— (2r — 1)[(s° — 1)B’_, — s(r — 2)B,_,] = 0, 


where the prime denotes differentiation with respect to s. The numbers 7 and 7” are 
determined from the boundary conditions. The problem of solving a partial differential 
equation in three variables in which only one variable is separable is reduced to solving 
a recurrence set of ordinary differential equations in one variable. 

The homogeneous equation corresponding to the equation (1.7) is 


(s? — I)w’” + 2s’ — rr + Iw — 5 c 7w =0 (1.9) 
of which, the associated Legendre functions P*(s) and Q*(s) are solutions. The function 
w’(s) will be used to represent both solutions of (1.9). 

It is immediately evident that for r = T in (1.7) and r = 7” in (1.8), the non-homo- 
geneous portion of the equations vanish. Hence one has A; = w7(s) and By’ = w>,_,(s). 
Since the solutions of the corresponding homogeneous equations of (1.7) and (1.8) are 
known, the equations can be solved easily, resulting in the following relations, 


: =< 2 or =f Pula)utla) de (1.10) 


A,(s) = w*(s) 
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and 
lz 
Be) = wt _,f ne | Clie? fe) de. 1.11 
: oi asa, Jest_1(2) (1.11) 
where 
F.(s) = k? d?A,_, — (2r — 1)[(s? — 1)A’_, — s(r — 1)A,-4] (1.12) 
G(s) = k*? d°B,_. — (2r — 1I)[(s’ — 1)Bi_, — s(r — 2)B,_,). (1.13) 


The constants ¢, , ¢, , c{ and ci are determined from the boundary conditions. 

Define a basic set of solutions ¥%,(P) and ¥4,(Q) such that A,(s) is equal to P7(s) 
and Q*/(s) respectively and the constants of integration in the expression for A,(s) are 
taken as fixed numbers. 

Let ¥’,(s, 7, @) be a solution of Eq. (1.3) of the form (1.5) with arbitrary constants 
of integration and with A7(s) = a,P7r(s) + b,Qr(s). Then 


vir(s, 1, ¢) — aWirl(P) — bivir(Q) 
is a solution of Eq. (1.3), with the coefficient of (s — cos »)~* vanishing. Thus we have 
Vir(s, 7,¢) — avir(P) — biWir(Q) = Virsils, 0, ¢), 


where y¥4,.,(s, 7, ¢) is a solution of Eq. (1.3) of the form (1.5) where the lower limit of 
» ° as rae ss -T-1 > 

summation is 7 + 1. The coefficient of (s — cos n) "in ¥7.,(s, 7, ¢) must be of the 

form a,P>,,(s) + 6.Q7.,(s). Hence in a similar manner we have 


Yir(s, 75%) — avic(P) — aWirsi(P) — b¥i7(Q) = bWir+1(Q) = Virsals, n, >). 


By mathematical induction we see that any solution ¥%,(s, , @) with arbitrary constants 
of integration in the expressions for A,(s), is just a linear combination of ¥%,(P) and 
y.,(Q) where p = 7, 7 + 1, T + 2, --- . A similar discussion follows for the solutions 
of the form ¥5,-(s, 7, ¢). So no restriction is placed if we take the constants in the integrals 
(1.10) and (1.11) as pre-determined fixed numbers, thus allowing us to obtain explicit 
expressions for a set of solutions of (1.3). 

Any solutions of the form {y¥%,(s, n, ¢) + Wor-(s, n, ¢)} with arbitrary constants of 
integration can be expressed as a linear combination of the explicit solutions y¥%,(P), 
y*,(Q), vo, (P) and y,-(Q) where p = T, T + 1, --- and p’ = T’, T’ +1, 

From now on we shall be interested in solutions periodic in the angle ¢. Hence we 
set u m where m = 0, +1, +2,:-:-. 

2. Obtaining explicit expressions for y7,(P) and y7,.(P). The wave functions 
y”,(P) and y™,.(P) are defined as those solutions satisfying (1.5) and (1.6) respectively 
where the coefficients A,(s) and B,(s) are given by (1.10) and (1.11) with the constants 
ci all set equal to unity, sy 1,(s) and B,y.(s) equal to P;'™'(s) and P7:™;(s) 
respectively. The negative superscript is taken so that our results include the case in 
which T and 7’ — 1 are integers such that | 7| < | m| and! 7’ — 1| < | m|! when the 
associated Legendre functions P)"'(s) and P"!, do not exist. 

For convenience the symbol J will be used to signify | m |, where m is a positive 


ee a ae 


or negative integer or zero, i.e. 


= |m| = 0,1, 2,3, --- (2.1) 
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The integral operator J(M, r) operating on the function F(s) is defined as follows: 





I(M, r)F(s) = w*(s) [ 1a ae r [ w(x) F(x) dx. (2.2) 

Hence the coefficients A,(s) and B,(s) for ¥7,(P) and yj7-(P) satisfy the relations 
A,(s) = 1(M,r)F,(s), (2.3) 
Bs) = I(M,r — 1)G,(s), (2.4) 


where F,(s) and G,(s) are defined by (1.12) and (1.13). To calculate A,(s) and B,(s) 
the following lemma* is needed: 
Lemma: If 


G(M, X, X — R) = (8? — 1°? Px“), (2.5) 


where X is a positive, and M a non-negative integer and if 
f 2 ‘ d Y r r >) > Y r , ) 
F(s) = aj (s' — 1) 7, AM, X —1,X — R) — (Rk — 1I)G(M, X — 1, X — R) 
ax 


+O, X -1,35 ~-R+ 0, 


then 


1(M, R)F(s) = —[a(M + 2X — R — 1) + bIG(M, X, X — R)[2X]". (2.7) 


Now A;,(s) = P7"(s) = P2P_.(s) = G(M, 0, —T) hence setting r = T + 1 in (1.13) 
and using the fact that A7_,(s) = 0, one has 


Frsi(s) = —(27 + p| — j) G(M,0, —T) — sTG(M, 0, -n |. (2.8) 
as 
But 
Arsi(s) = 1(M,T + 1)Fr.,(8). (2.9) 


Now F.,(s) corresponds to expression (2.6) where 
a= —(27 + 1), b = 0, X =i, R=T +1, 
hence the lemma gives A,;.,; from (2.9) and (2.7), resulting in the following relation 


Ars, = —2°'[-(27 + 1I)(M — T)JG(M, 1, —T) 


(2.10) 
= (T + 4)(M — T)(s’ — 1)'?P~7-1(8). 
To obtain A,.,(s) one must break the operation 
Ar+.(8s) = 1(M, T + 2)F 7,2(8) (2.11) 


into two separate integrals, i.e. F7,.(s) must be broken up into the following two ex- 
pressions 
—(2T + 3)[(s? — 1) Ata — 8(T + 1)Arai] (2.12) 


*This is a consequence of Lemma 2, page 13, [6]. 
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and 
ed’ Ar. (2.13) 
Expression (2.12) reduces to the form given by (2.6) where 
a = —(2T + 3)(T + 3)(M — 7), b=0, X = 2, R=T+2 
and (2.13) reduces to the form given by (2.6) where 


a = 0, b= kd’, X = 1, R=T +2. 


Hence using (2.11) and (2.7) one obtains 
Ars. = —[—(27 + 3)(T + 4)(M — T)(M — T + DIG(M, 2, —T)4"' 
— k? d’G(M,1, — T — 1)2* 
mens (2.14 
UT + 3/27 + 2(M — TM — T + 1s — )P~%9) ) 
2 32 

_ ae (s? — 1)'“?P-F-2(8). 
Rather than calculate the remaining A,(s) in a similar manner it is better to obtain 
them through mathematical induction. Assume that 


((r—T)/2) 
A(s)= >> (kd)*A'G(M,r—T-t,-T- 9, (2.15) 


t=0 


where A} are constants. In the expression for the upper limit of summation, the following 


notation is used: [x] is the integer such that x — 1 < [x] < x. The above expression 
obviously holds for r = T, T + 1 and T + 2. 
Assume that the expression holds for r — 1 and r — 2, hence in order for it to hold 


for r one must have 


A,(s) = 1(M, r)F,(s), (2.16) 


where from (1.13) and (2.15), F,(s) becomes 


Fiy=kh@ YS (kd"A_G(M,r-2-T-t-T-0 


— (2r — p| « — 1) © s(r — »| 
ds 


r-T-1 2) 
> (kd)*AlL.GM,r-1-T-t,-T- 0. 


The coefficient of (kd)** in F(s) is 
A,oG(M,r-1—-—-T-—t,-—T—t+4+1) 
— (2r — p| « — }) : — s(r — » |. GM,r-1-T-t,-T-— 2. 
( 


s 
° » 7 ’ . . ‘ ) -1 
Hence the coefficient of (kd)*' in I1(M, r)F,(s) is, on setting a = —(2r — 1)A}_,,b= Ajz2, 
X =r-—T—t,R = rin expression (2.6) and using (2.7) 


—_ 


ot .. Met» ; Mer F =F 8 
[((M — 2T — 2t+r— 1)(2r — 1)A‘_, — ALR] — a oT) (2.17) 
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Equating coefficients of (kd)*‘ in (2.16) one obtains the relation 
A'@(M,r-T-t,-T-d 
= [(M — 2T — 2¢+r— 1)(2r — 1)A‘_, 


GM,r= 7-4 —-2-1 
2r—T t) 


(2.18) 
~ A) 


which shows that the expression (2.15) holds for r if it holds for r — 1 and r — 2, provided 
that the constants A; satisfy the equality 


A‘ = [(M — 2t — 27 +r — 1)(2r — IAL, — AL2°r —T- dd". (2.19) 


Thus one obtains 


sg, Se *"T(— T — t+ 9l(— 27 +r —-— 2t+M) (2.20) 

si (ir — T — 2t)")IT(—r 7 + 4)r(M — T) ia 
This expression for the constants A‘ holds for r = T, T + 1, T + 2 as is seen when 
comparing the values given by (2.20) for the casesr = 7+ 1,t=0;r=7+2,t = 0; 
and r = T + 2,¢ = 1 with values of constants in equations (2.10) and (2.14). Since 
the expression (2.15) holds for r = 7, T + 1, T + 2 and it has been shown that if it 
holds for r — 1, 7 — 2 then it will hold for r, one can, by mathematical induction, con- 


clude that (2.15) holds for every r. 
Hence one can immediately write 


T)/2 


y"7(P) =¢ >» (s — cos ny)’ >. (k d)*' A‘(s’ — 1) hide: = “ig ‘(s). (2.21) 


r=7 t=0 


— a ies -— 
Noting that B;.(s) = P_;:(s) we can obtain in a manner similar to the above the 
following explicit expression for Yj, (P) 


Wor (P) = e*”* sin 7 >. (s — cos 7)’ b (k d)** (2.22) 


where the constants B, are given by 


at ie its (= 7’ — t+ 5)M(— 27’ +r — 2t+ M +1). 9 92) 
ia re ee eee ee ee ae aie 


The expressions (2.21) and (2.22) can be placed in a neater form. In (2.21), inter- 


change the order of summation of r and ¢, obtaining 
x r—T)/2 x Pa 
r= 7 t=0 t=0 r=2t+7 


(The validity of this operation will be shown below.) Then replace the summation over 
r by summation over ¢ where o = — 2t + (r — T). One then has 


Pe 4 = (k d)*‘a.(s* = gyerre pee 
Yer(P) =e" 2.  »  <aeeee F Pees (s), & 


t=0 o=0 (8 — COs n) 


bo 
bo 
wae 
wS 
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where 
«_ (—1)(M — T)(T +t + 5), . 
hia 2‘(t) (a)! omy 
Similarly ¥7.(P) reduces to the form 
2 x a 2tpt7,2 (o+t)/2 
vir(P) = e™ sing SEO ble — pew), (2.28) 
t=0 o=0 (s — cos n)- . 
where 
»  (—1)(M +1 — T")(T’ + t+ 35) ™ 
b! = : fe eee 
. 2'(t) (a)! aad 
To show convergence of the above series we require the following inequality 
P>“(s) < (8s — 1I)"7(s + 1)° "Is (8? — 1)'”)]’/TU + M), (2.28) 


where the positive and negative signs are taken when vy > 0 and O > » respectively. 
We shall represent the series expansion in (2.24) by the following 


e7{P) = e'™ + ¥C,4, »). (2.29) 


t=0 @a=0 
Using (2.28) and the inequality 


(s — 1)\(s — cos n) -— % 


which holds for 1 < s < © andO < 9 < 227 we can show that the series bm | Cre(8, 0) | 
is dominated by the absolutely convergent series D,(s, ) ; An » | d,, | where 
|(M — T)(T +t + .5), | 


die | = (1+ M+ 0, (2.30) 


and 
m 


D.(s ) = (: —_— ‘) 2 k?'d?"(s awe 1)‘[s + (s? — 1)'"}'* 
vi Net 2" 


Ord + M + #(s — cos n)***" 


where the positive sign is taken when ¢ + 7 > 0, negative sign otherwise. 
For large t, (t + T + 3 > 0), we have 


ae ee eS K [1 + o(4) , (230) 
: ri+t+T7)\ 15+ M — T) t 
where K is a consiant which vanishes if JJ > T. 

Using (2.31) it can be shown that the series >: = 5 » D.(s, n) | d,, | is absolutely 
convergent for every value of kd and every value of s and 7 in the ranges 0 < s < @ 
and 0 < » < 2z. Since the series :  & ¢,, | is dominated by > > D,(s, n) | die | we 
can say that the original series given by (2.24) is uniformly and absolutely convergent 
forl <s < @ andO < n < 2m. Hence the change of order of summation above is valid. 


A similar discussion follows for the series (2.26) except that the region of uniform 
convergence 1 < s < © andO < 7 < 2z, holds only if JT’ — 1 — M is a non-negative 


integer. For other values of 7’ — 1 — M the region of uniform convergence is 1 < s < © 
and 0 < » < 2z, the wave functions y7,.(P) being discontinuous along the lines 7 = 0 


and 7 = 2. 
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3. Determination of T, T’ through continuity conditions. We will be concerned with 
solutions of the Helmholtz equation for the exterior problem, that is, solutions which 
are non-singular and continuous in the region external to the torus s = s, , where by 
external we mean 8 > s > 1. The region s,) < s < @ describes the surface and interior 


of a torus. The limiting torus s = © describes a ring with radius d and centre, the origin. 


As is seen from (1.1), a portion of the surface of the torus s = 1 is the z-axis. The rest 
of the surface extends throughout infinity in all directions, i.e. if R is the spherical polar 
coordinate, then R = o for torus s = 1. 


We shall eventually consider solutions satisfying the radiation condition, but before 
doing so, we must consider the effect of applying the conditions of continuity and non- 
singularity in the region s. > s > 1. Since Q7(s) is singular on the surface s = 1, solutions 
involving the associated Legendre functions Q7(s) will be singular there. Hence it 
seen that solutions which are non-singular in region 8s, > s > 1 can be formed only from 
y",(P) and ¥7,.(P) solutions. The condition of continuity shall be applied to the y7",(P) 
and y3,-(P) solutions. 

As was mentioned above y¥j,.(P) is discontinuous at 7 = 0 unless 7’ — 1 — Misa 
non-negative integer. Even though the other wave functions are discontinuous at » = 0, 
it is possible that there exists a linear combination of them which is continuous at 7 = 0. 

We will be concerned with solutions odd in the variable 7, of the form 


v72(s, 7, ¢) = >, a,Wenr(P), (3.1) 


where p increases by integral values only. The coefficients a, functions of kd only, are 
normalized such that there exists at least one coefficient which does not vanish when 


k is zero. Let 
T'+N’ 


[vo(s, 2, bh-o = Dy lYor(P)h-o , (3.2) 


p=T 


where WN’ is an non-negative integer. We require ¥5(s, 7, ¢) to be continuous at 7» = 0 
for every s and kd in the ranges 1 < s < ~ andO < kd < o. Hence a necessary con- 
dition for continuity is that the Laplace portion of ¥%(s, n, ¢) must be continuous at 
n = 0. Since W7(s, 7, 6) is an odd function of 7, the following must hold for 7 approach- 


ing zero 


[Wols, 1; ) lk-o ~ 0(n) l<s< oe. (3.3) 


Now it can be shown that [¥j7-(P)].-. has the value as 7 approaches + 0 

3 — 1\"” (Qn)'/(g — 1)72-7" 
[vor-(P)].-0 ~ (e ) "a ro ——;—__— -+- ((7). (3.4 
Yor s+ 1 r—-7 +1+M™M)r(1 7 " (3.4) 
1 


The term independent of 7 in (3.4) is zero only if 7’ — 1 — M or — T” — 3 
negative integer. Because of the factor (s — 1)~” there is no linear combination of 
Yor (P) which will satisfy (3.3) unless T’ — 1 — M or — T’ — 3 isa non-negative integer. 
Hence in (3.2) the lower limits of summation may have the values M + 1, M + 2, 

l 


or — 3, — 3/2, --- . In the latter case the upper limit of summation is — 3. 


The even solutions of 7, ¥”,(P) have all been shown to be uniformly convergent for 
Ul eT » 


the region 1 < s < ~ and O < 7 < 2z, and hence are continuous everywhere in the 


region. However, if we differentiate term by term the series given by (2.24) with respect 


is a non- 
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to the variable », the resulting series can be shown to be uniformly convergent in the 
region 1 < s < @ and 0 < 9 < 22 and discontinuous at 7» = 0 or 27, except when 
T — M is a non-negative integer. In this case the differential series is continuous at 
n 0 and n = 2r. 

We will be concerned with obtaining a necessary condition for continuity of the 
partial derivative with respect to 7 for the even solutions ¥”(s, 7, ¢) where 


v"(s, 0, ¢) = >. BWo(P) (3.5) 
and which has the value when k vanishes 
T+N 
[y"(s, n; >) |eno ” > > B,[Wo(P) ho * (3.6) 
p=T 


Using the asymptotic relation for 7 — + 0 


” —* M/2 i. -1/3-T 
| 2 eu) | ~ -( ‘) _ a fe | maa 5) + On) (3.7) 


s+ 1 n(—-T + M)I(T + 
we can deduce in a manner similar to the above that the lower limit of summation in 
(3.6) may be M, MJ + 1, M + 2, --- or — 4, — 3/2, --- , and in the latter case the 
upper limit of summation is — 3. 


Imposing the above conditions of continuity on T and T” restricts the number of 
solutions. 
To simplify further work y7\(P) and y7jy(P) shall be defined by the relations 


Vin(P) = (k dy Wren) (P), (3.8) 
Wow(P) = 27°7(k dy Wa usnaiy AP), (3.9) 


where in (3.8) 7 is replaced by (MW + N)/2 and in (3.9) T’ is replaced by (M+ N + 1)/2. 
Later on we will be concerned with wave functions of the form 


> a, W*(P), (3.10) 
> 8,vA(P), (3.11) 


where a, and 8, are independent of kd. Substituting in the expressions (3.8) and (3.9) 
and normalizing we see that on employing the necessary conditions for continuity we 
obtain for (3.11) 


N=M+4+2l or —-N-1=M+42l 
and for (3.10) 
N=M4+2/4+1 orf —-N-1=M4+214+1, 


where / = 0, 1, 2, 3, 

4. Asymptotic value of ¥";(P), ¥7\(P) when d > 0. The main problem: is to find 
the linear combination of w7\(P) and yWjx(P) which represents outgoing radiation 
from a ring source. A necessary condition for this is that the wave function represent 
radiation from a point source when the radius d of the ring approaches zero. Hence 








a 
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one must first consider the asymptotic values 


lim W7\(R, 0, 4), 


d-—0 


lim yiv(R, 6, $), 


d—0 


where (F, 8, ¢) are spherical polar coordinates and before taking the limit, the toroidal 


coordinates (s, 7, ¢) are replaced by (R, 86, ¢). 
Since 
d(s’ — 1)'”” dsin n : 
p= : Z>7 eee (4.1) 
(S — COS 7) (§ — COS ») 


where (p, 2, ¢) are cylindrical polar coordinates, one obtains 


R° (p + 2°) (s + cos n) 
i e 2 7 (4.2) 
d d (s — cos 7) 
and 
- (s° — 1)'” : 
lan @ = ; . (4.3) 
sin 7 


Eliminating » from (4.2) and (4.3), one can obtain an expression for s in terms of R and 
6. Hence the following are obtained when d approaches zero 


> 1/4 a. 
(s — 1)" ~ Z2sin 0 
fi 
(4.4) 
d- = 
s~1-+ 53 2sin’ 0 
ki 
Similarly one obtains 
(§ — Cos ~ 2d°R . “4 

. (4.5) 


sin 7 ~ 2dR™" cos 6 


The asymptotic values given by (4.4) and (4.5) hold not only when R is fixed and d 
approaches zero, but when d is fixed and R approaches infinity. 
Using the above the following limits can be ealculated 


lim W7V(R, 6, >) H(R, 0, o) 
. TT 
<= © <€ 9? (4.6) 
lim Wov(R, 6,¢) = HR, 0, o) - 
where 
- . \ , ee 1)‘(kR)*' : . , . na 
H\(R, 60,¢ 2 sabes : > (1)! sin TP (cos @). (4.7) 


- ) 


For present purposes the limit is only required for the range 0 < 0 < 7 

5. Solutions of the wave equation representing radiations from a ring source. 
Having obtained the asymptotic values of ~"\(P) and W7\(P) we now find the linear 
combination y¥ representing a solution of the Helmholtz equation, satisfying the radia- 
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tion condition, and possessing a ring singularity (i.e. singular at the limiting surface 
s = o@), This is done by using the necessary, condition that yy represent radiation 
from a point source when d — 0. No further restriction is placed if it is required that 


lim yy = e'"*h\(kR)PX (cos 8), (5.1) 


d—+0 


where JN is an integer. 


Since 
hy’ (kR)PX (cos 0) = jy(kR)PX (cos 0) + i(—1)**'j_y_\(kR)P“y_, (cos 6) 
it can be seen that the desired linear combination has the form 
vy = Wy + i(-1)" 0%), 
where 
VY = > CAN, Myvr(P). (5.2) 
Thus it is seen that (5.1) can be replaced by 


lim Wy = e'"*j\(kR)Px (cos 6). (5.3) 


Using (4.6) and (5.2) this reduces to 


> CAN, IMH"(R, 0,0) = e'”*jy(kR)PX (cos 8) 0<@0< (5.4) 


Nia 


Equation (5.4) determines the unknown constants C,(N, 1/7). It is an identity in the 
variables (R, 6, ). In solving for C,(N, A/) we shall consider @ to be in the 
range 0 < @ < 2/2. This will be sufficient for present purposes. 

The right hand side of (5.4) is of the form 


I =) saat 
(x)! 2 . (—1) 2 


: P* (cos @)e'”®. (5.5) 
2 so (p)!l(p + N + 3/2) 


This is a power series in (/R) with lowest power N and with all terms in the power 
series of the same parity as V. Thus considering the expression for H” given by (4.7), 
it is obvious that the left hand side of (5.4) must be of the form 


ye Crro(N, M)HX.2(R, 0, >) (5.6) 


and on substitution of the expression given by (4.12) this becomes 


még M-N)/4p pn So (RR) <x Cus2(N, M)(—1)?" apr yp-u-ver Per 
2 (KR) »S 9” 2 "Te -++ D sin 0” 'Py., (cos 6). (5.7) 


- , é . “— ah 2 - — N in 
Substitute the expression given by (5.5) and (5.7) in (5.4), divide out (kR)* e'”°, and 
equate coefficients of (kR)*? to obtain 


1/2—M 7, p : r (—1 

(qr) ] y (cos @) — 9 M-—N)/2-p > ( vear(A ‘ M)-( 1) sin gr" — 

(p)!T'(p + N + 3/2)2**” = lip —rt+ 1) (5.8) 
x Px?" (cos 60), where p = 0,1, 2,3, 
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The constants Cy,.,(N, M) are determined from the infinite set of equations (5.8). 
To solve for Cy.2,(N, M) the following result is used 


Px™ (cos 6) . sin 6” "Py ™.”*" (cos 6) = 
Lit. . @ Len ze a aanavrar a ae (5.9) 
2*(p) !T(p + N + 3/2) 0 2 (p — M(NIT(N + 3/2 + 7) 


The proof of the above expression is found in [6]. 
Using the fact that 


Mu r(N + M + 1) p-™M 


T(N ooh M de 1) Ion (cos 6) (5.10) 


P* (cos 6) = (—1) 


and the identity given by (5.9) one immediately obtains 


awn r+M 
ee ee pee eS Lt eT 

r(N — M+ 1)(!IT(N + 3/2 +7) 2000?" , 
So now the coefficients C,(N, M) in (5.2) have been found such that Eq. (5.3) holds. 
Since the limiting process of d approaching zero is independent of whether y”.(P) or 
y’,.(P) are used in (5.2), one has two separate solutions for Wy , odd and even solutions 
in the variable n. Define the even solutions by SY and the odd solutions by 7% as follows: 


e'™ Sy = Do Crsa(N, M)iv+2(P), (5.12) 
TE = > Cysa(N, M)ine2(P). (5.13) 


Applying the necessary condition for continuity of the function and its derivatives as 


3), it is seen that in (5.12) the subscript N is such that 


given in Sec. (3), 
N=M+2l o -N-1=M+42l 
and the subscript N in (5.13) is such that 


Nw=M+2/+1 0 -N-1=M+4+21+1, 
where | = 0, 1, 2, 
S™ and T™ have the following properties 


. ’ . Vv 
lim Sy = jy(kR)Px (cos 6) 


M< A0<5 
lim fc = jy(kR)Px (cos @) im 
Hence the functions VY and W°¥ defined as follows, 
Vy = se + i(—1)**'S™- ) (5.14) 
We =Ty + 4(-1)°"T*- , 5.15) 
have the property 
lim Vv ho '(kR)P* (cos 0) 
i a Ot = 
f oe 9 


lim Wx h\?(kR)P* (cos 8) 
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Thus we have a set of functions e'"* VX (s, ») and e'"* W4(s, n) which satisfy the necessary 
condition for outgoing radiation. Their analytic properties of continuity convergence, 
singularities etc., must be investigated. 

The first thing that is required is to evaluate (5.12) and (5.13). From (5.12) and 
(5.11) it is seen that 








T(N + M + 1)()'*(-1)” < (—1)'y,2",,( ; 
T(N a. M 4. 1)! ™ +788 > 2"(r)!T(N 4 a 2. r) ’ (5.16) 


ieee = = 
where by (3.8), W7y.2,(P) is obtained from (2.24) by replacing 7 by (V + M)/2+ Pr 
and multiplying the resulting series by (kd)***". 
The expression given by (5.16) can be simplified and the following is obtained 





</ AT 1/2 M 2p "K 1 
a r(N + M ee 1) ae. - ) d)""(— \-e, 2 (5.17) 
T(N — M +1)2 (s — cos n) poo (8 — Cos n)”2 
where 
(48 -- N) (x +M+ 1) ne 
; en —](s* — 1) 

wt ow — >» E _ 2 a < ae as 

. (p)IT(N + 3/2 + p) = (r Ns — cos 7)" (5.18) 


ex ure w)/2 (8). 


Since N is specified for S¥ such that N = M + 2lor — M — 2] — 1, and Lis a positive 





integer or zero, the series in (5.18) terminates when r = l. 
Similarly one can reduce the expression for 7X to the following 
‘ rN + M + 1) is *(—1)™ sin n(k d)” (k d)?(—1) ‘Ky . 
(N - VW ot 1)2 ew “(s — Cos n) incl tichi p=0 2”(s -—- Ce n)” 
where 
M-—wN 1\(M+N r/2 
, Cee tee 
2 ns ~ >- : Soest — 5.20 
K; (p)!l(p + 3/2 + N) 2d, (r) (s — cos n)’ oan 


M 
" of M+N i 2(s). 


Since N is specified for 7’ to have the values N = M + 21+ 1 or — M — 21 — 2, the 
series in (5.20) terminates when r = 1. 

Hence we see that SY and 7% are comprised of two series one finite and the other 
infinite. Hence for convergence, we only need to consider summation over p. Using 
the inequality (2.28) an absolutely convergent dominant series can be found for the 


series 


a ___(kd)”*P SN eM) / o(8) 


2°(s — cos n)*(p)!T(N + 3/2 + p) 

Hence it can be shown that the series expression in (5.17) is absolutely and uni- 
formly convergent for the region 1 < s < ~,0 < 4» < 27and0 <|kd| < o~. If the 
series (5.17) is differentiated term by term by either s or 7, the resulting series is also 
uniformly convergent for the same region. 

The same remarks hold true for the series expansion (5.19) given for TY . 
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6. Integral representation of V;;,.,(s, 7) and W%),..4:(s, »). Before the asymptotic 
values for large R can be obtained, we must obtain an integral representation for each 
of the functions. 

Using the following integral expression for the associated Legendre function P*(s) 


where up < 3 
u 2s — 1)" : ? 1/2 n+ms-: > . 
P;(8) \172n/1 | [s + (s' — 1) cos ¢t]"** (sin t)™ dt (6.1) 
w) “T'(s — p) Jo 
we obtain 
(3? — 1)'"P-*~"(s) “se = ih" [ Z’ (sin t)’” dt (6.2) 
eee rk 
8 — COS 7 (x)' “TG + M+pr) Jo [s+ (s? = 1)’ cos t}" 
where 
(s* — 1) (sin ¢)” - 
= ———__““_ ae . (6.3) 
2(s — cos n)|s + (s' — 1)” cos t] 
From (5.18) and (6.2) we thus obtain 
v1 2-5? — 1)" (x)"'” 
| = qr T ! ‘ ) J V1 
(pIT(N + 3/2 + p)T(4 + M) 
,(M-—-N N M l : : (6.4) 
(5 es ae = -~:;24 M-Z)} (sin 0)” dt 
J [s + (8° — 1)” cos t]"~-?” 


The range of Z isO < Z < 1, Z being unity when cos 7 = 1 ands + (s* — 1)'” cost = 1. 
Since we are considering the case where N = M + 2/ or — N — 1 = M + 21 the hyper- 
geometric function in the expression (6.4) is a polynomial with finite argument. Thus 
the interchange of order of summation and integration is valid. Now in the expression 
for S (5.17) substitute expression (6.4) for K}, . Interchange the order of summation 
and integration. Noting that 


ee 4 = = - (6.5 
jv(X) » DITA +3/2+p) 2 — 
ro 2/1. Vo V/2-1 nef PLE 2p 8 (9? — 1/2 ., s p+N/2 
_ @ (k d)°2 i a # k d)””| = & J) a8 ty!" (6.6) 
(s — cosn)*”? p=-0 2(8 — COS 7) (p) tI (N + 3/2 + P) 
where 
XY = kd 9! 2} 46 + (s° — i? , cos ft “ (6 7) 
; 8— COB” an 


it is seen that the infinite series in the integrand is uniformly convergent for © > s > 1 
and 0 < ¢ < x. Hence the following holds 


r(N —-M+1)ra+M) 


[" (¥ =N N+M+1 44 y, z) 


J0 ~ ~ / 


r(N + M + 1)(-2)"@)"” 


M 
Sy = 


(6.8) 


*jy(X)Z"” (sin 1)” dt L<2< me. 
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When N = M + 21 the integral in (6.8) will hold also for s = ©. From = we have 


_ (2M + 2l) 2)" (x) ft wri ks 1.3 -Z)hD 
Vite = oni eM) I, CUM +L A+ Mi DRX) 6g) 
x Z™” (sin t)” dt, 


where h\/’(X) is the spherical Hankel function of the first kind. In a similar manner the 
mM 


following integral expression may be obtained for Ty where N = M + 21 + 1 
o —-N—-1=M+421+4+1, 
1/2 


ry — TN + M + 1)(—2)™™ sin nfs’ — 1)°'? 
ion (N —-M+ Rae + 3) 


so ae (6.10) 
° | (Mout M+WN + i: 1 + M; 2) 


2 ; 2 
te!" Gin * a. 


The integral expression in (6.10) holds in the ranges 1 < s < © forN = M+21+1 
and 1 <s < @ for N = — M — 2l — 2. We also obtain the integral expression for 
W*t+21+1 Which holds for 1 < s < @ 


/2 


We _ (2M + 2l + 1) —* Zz ae n(s' — 1)” 
M+2i+1 — (21+ 1 1/47 ( 24 M) 


a* 


| F(—l, M + 1+ 3/2; 3 + M; Z)Rworas(X)Z"*?” (sin 2)" dt. 





(6.11) 


7. Asymptotic values of V ¥/,,,(s, 7) and W¥..,.,(s, 7) for R > ©. Having obtained 
the integral formulation of the wave functions, we are now in a position to investigate 
their asymptotic values when R approaches ~. 

From (4.4) and (4.5) the following asymptotic values are obtained for R approaching 


(s*— 1)'” d ‘ l 
~ Is — 7 
: R | 2 sin 6+ o( 2) | : (7.1) 
S— cos 2d° 43) i = 
; R? | ‘? 0( 7 | (7.2) 


Hence the asymptotic values of Z and X for large R are 


Z ~sin’ tsin® 6+ 0( ') (7.3) 
and 
X ~ kR + kd cos tsin 6 + 0(4)- (7.4) 


Insert the values (7.3) and (7.4) into (6.9) and using the fact that when R approaches © 


aad u+or+1 exp (tkR + tkd cos tsin 6) 
han+2(X) ~ (1) kR 


we have 


Ve wing i R™ ,., (cos 8), (7.6) 
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where 


_ (2M + 21) "—1) "(9)"? 


Rit+21 (cos 8) — 
(21) '(2)" T(M + .5) 


; (7.7) 
x [ Fi(—-l, M+ 1+ 4;4 + M;sin ? sin 6’) sin 6” sin ?™ e'**""'*'"" At. 
Expand the hypergeometric series in (7.7) and use the relation 
Tin + 4)J,(2) = 9! ‘(2) [ rr" aint” at (7.8) 
to obtain 
Rv .21 (cos 8) = ise > =z ert + 4). (? sm Vy uw+(kd sin 6). (7.9) 


To obtain the asymptotic expansion for W4..,.,(s, 7) the following result is needed 
[from (4.3)] 
sin n(s’ — 1)°'’? = cot 80. (7.10) 
In a manner similar to the above, the asymptotic value of W{f,.,,,(s, ) can be obtained 
for R approaching « 


tikR 
W wevarei(8, 9) ~ (—1)"*7**? LR Rivso1+, (cos 6), (7.11) 
where 
(2) ! cos — (—1),( 3/2 si 
er { + 21 + 1)! cos 0 >! —1) (M + 1+ 3/2), (28in ‘) 


kd 
-J .,(kd sin 6). 


(21 + 1)!(—kd)™ oo (r)! 


From (7.6) and (7.11) it is seen that the wave functions e'"® V{..,(s, ») and 
imo [77M . — er aes ’ : 
e*”* Wirso141 (8, ») Will satisfy the radiation condition. We shall consider the case where 


d approaches zero. It can be shown that the asymptotic values for Z and X given by 
(7.3) and (7.4) will hold when d vanishes. Hence we have 


lim ie 2 AY... (kR) lim R™ ., (cos 6). (7.13) 
But 
2M + 21)"—1)" 
lim R™ .., (cos 6 oh ’ 
40 . (21) '(M)! 
(7.14) 
sin 6g . . ss aS 
\ 2 of (—l, M+1+ 3; M + 1;sin’ 6) 
\ 
2M + 21)"—1)”" 1 
: Py.2, (cos 6) — 
a un (7.15) 
: p™ 21 (cos 6) 0 < 6 < r. 
We see that R¥.,., (cos 8) is identical with the associated Legendre function P™,,, 


(cos 8) when d vanishes. In a similar manner it can be shown that R¥y,.,,, (cos 0) becomes 
e ° vu ° ° . ° 
identical to Pyy.2,.; (cos 8) when d is zero. Hence the toroidal wave functions 
ime 17M mo 117M . . . . . 
e*”* Vivs02(8, 7) and ¢ W x +21+1(8, 7) are identical to the spherical polar wave functions 
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when d vanishes. Thus when d vanishes they represent radiations from a point source. 
The remaining detail to be considered is the nature of the singularities of the wave 


function ats = ~, 

8. Asymptotic behaviour of V¥;,.,(s, 7), Witso1.,(8, ») a 8 > @. In order to in- 
vestigate the asymptotic behaviour of Vxy,.,(s, 7) and Wi40:.,(s, ») we must investigate 
the functions Sx.2: , S“w—o-1 , T4214, and Ty_,,_. separately. 

For Sis, and T¥/,.,., use the integral expressions. We have shown that the ex- 
pressions hold for 1 < s < @ and since the integrand is finite for s = © we may use 
the integral expression to determine the asymptotic values. 

'rom (6.3) and (6.7) the asymptotic values of Z and X become as s approaches ~ 


Z~2°‘(1 — cos 2), (8.1) 
X ~ kd 2'°(1 + cos @)'”’. (8.2) 
Hence we obtain the following for s approaching © 
Si .o. ~ constant, (8.3) 
einen — constant. (8.4) 


It is more difficult to find the asymptotic values for S”y,_.;-, and Ty_»;-2 . For simplifi- 
cation of work, the functions F,,(s, 9; 1, M) and G,(s, 7; l, — are defined as follows 





s — cos )'*! 2 pee ) (MM + l + l+h, (s” — 1)” -—~M-r (s) 





Fs, 0; 1, M ZT o(r)\(s — cos n)’ sitet 
(—)(M + 1+ 3/2), aes 
‘- sal — s n)'t!/2 " O/4 
G,(s, 9; 1, M) = sin nf: cos 7) } (r) > wea 
-(3?> — 1)'?P> =r (s)| 
Thus from (5.17), (5.18) and (8.5) it is seen that 
gM , r(2M = 21 + 1)(m)'*(kd)-“~?*-" 
—a iS, 1) = 1+1 2 — 
(2 a 
21 + 1)2° 1) (8.6) 
e (—1)’(kd)”’F,(s, n; l, M) 
= (p) 12° = = # oe p - +)(s — cos n) )? 
and from (5.19) and (5.20) 
ra T(2M + 21 + 2)(x)'"(kd) “~?'~? 
u—21-2\8, 7 1+1/2 ul 
“(92 9)9 2¢ 
r(2l + 2)2 (—1) (8.7) 
Bs (—1)’(kd)”’G, ;1,M) __ 
& (p)!2°1(— M “Ts 1)(s — cos n)” 


Now multiply S“y_.;-.(s, 9) by (kd)"**'** e'"* and let k approach zero. The only term 


which is non-vanishing is the term involving F,(s, 7; 1, 11). Thus we have the following 


{(]J\M+2l+1 (imo oM ) 
{ (kd) é Ov 21-1$k=0 


T(2M + 21 + 1)(x)'2'-"”*(—1)™ e'™ |, ; (8.8) 
r(2l + DI(—M — 214+ 4 F,(s, n; 1, M). 
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But any solution of the wave equation when k = 0 is a solution of the Laplace equation. 
Thus e'”® Fy(s, 7; 1, /) is a solution of the Laplace equation. Now Fo(s, n; 1, M) is a power 
series of (s — cos 7) up to the /th power, multiplied by (s — cos )'’* and is non-singular 
when s = 1. Since the complete set of solutions of the Laplace equation which are non- 


singular at s = 1 are 


e'™* (s — cos n)'?P.",,2(s) cos nn | 
and ’ aval 1| (8.9) 
‘™? (s — cos n)'’P,™.,.(s) sin nn) 
then 
F (8s, 9; l, M) = Z a.(s — cos n)'’’P, ‘/2(8) COS rn. (8.10) 
By a similar analysis it is seen that 
i+1 
G(s, 931, M) = >> bs — cos n)'?P-\,.(s) sin rn. (8.11) 


r=1 


So the problem is to calculate a, and b, . To find the coefficients a, multiply both sides 


of Eq. (8.10) by (s? — 1)~”” and let s approach 1, using the fact that when s approaches 1 
9 af ee 1)" 
s° — ]) . "P. . (Ss) =. or 
ri-+M-+pr) 
we obtain the equation 
ye a, cosrn = (1 — cosy). 8.12) 
But on the expansion of (1 — cos 7)’ in a Fourier series we see that 
(21) !(—1) 
€\4 $ 
a, = 8.13) 
(l—nrni(l+yr)!2 
where 
e-=1 when r= 1,2,3,-:-- 
8.14) 
€ + when r = 0. 
Now in a similar manner Eq. (8.11) may be reduced to the following 
t+1 
sin 7(1 — cos n) = >. b, sin rn. (8.15) 
1 


Thus the coefficients b, are derived from a Fourier analysis and 6, are 


(9 a 
“ r(2/ + 1)'(-—1) (8.16) 


I+i¢t+nvil+1—nvni2- 
Thus we obtain the following expressions for F,(s, n; 1, 17) and G,(s, n; 1, 7) 
e,(2)) (—1) uM 


F.(s, n; 1, M) = (s — cos n)'” > P.-\/2(8) cos rn, (8.17) 
r=0(l—rnil+r!2- 


l 


ees. (21 v1" 
G(s, n; is M) = (g — COs n)' 2 p r( 1) ) 
mi(l+1+n\"(l+1—n)!2 


P-™,.(s) sinrn. (8.18) 


i-1 
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Using the following asymptotic values for the associated Legendre function 


mens 2) 1/2 1/2 ; s = ) 
f 1 2(8) (2 T ( i + 1) | { log, (2s) Di ¥( M + 5)} 
tei y = 0.5772156649--- and ye) = aE. re) (8.19) 
pips 2°T(n + 3)8" _4 
PO~O71¢nt+mM "7? 73 


we see that the values of F'o(s, 9; 1, ) and G,(s, n; l, M) become the following when s 
approaches « 


at) 


l 


F,(s, n; l, M) s\? P77 .(s) cos In , (8.20) 


l 
G,(s, n; l, M) ~ < ue - pom o(s) sin (1 + 1)n. (8.21) 


It can be shown [6] that 
(s — cos n) ’F,(s, 9; l, M) < O(s'~') p = 1, 2,3, 

and hence the dominant term in expression (8.6) given for S"y_.,-; is Fo(s, 9; 1, M) for 
s > 1 provided that kd < s. Similarly it can be shown that G,(s, n; 1, M) is the dominant 
term in expression (8.7) given for Ty_)-, . 

Thus from (8.6) and (8.20) we have when s nga co 

(2M + 21)\(2 —1)**! a 

+ eer 5 —<\— s\*P7™ (8) cosln —_— (8.22) 

~ 2D IT(— M — 2l “e 5) (kd) 


M 


S—w-21-1(8, 





and from (5.14), (8.22) and (8.3) we have 
i(—1)'""(2n)'"(2M + 21's 
(kd)“*?***(21I 'r(—M — 21 + 4) 
From (8.7) and (8.21) we have 
s+ ye’ hii 
(2+ IIT(-M — 21 — 1)(kd)™ 


V w+21(8, 9) ~ > (8) cos In. (8.23) 


- PrN /2(s) sin (l + 1)n (8.24) 





mM — 
1 u—21-2\8, 7) ™ 


and hence 


. lve 1/2,1/2 
Ww... ..(8, 7) ~ i(—1) (2M + 21 + 1)'@r _ aa ™ (s) sin (+ I)2. (8.25) 
(21 + 1)!"1(—M — 21 — 4)(kd) ; 

9. Orthogonality and general discussion. We have obtained a set of solutions of 
the Helmholtz equation in toroidal coordinates which satisfy the radiation condition, 
are continuous and convergent in all space and possess a ring singularity. When the 
radius d of the ring described by the coordinate s = © approaches zero, the toroidal 
wave functions become identical with spherical polar wave functions. Hence the toroidal 
wave functions e'"* V}.,.,(s, 7) and e'"*® W4,.,.,(s, ») form a complete set. That is, any 
solution which is continuous and single-valued outside the torus s = sg (i.e. region 
1 < s < 8s) and satisfies the radiation condition and arbitrary boundary conditions on 
the torus can be represented by a linear combination of the above toroidal wave functions. 
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Now the question of orthogonality arises. Are the functions e'"*® Vy;,2,(s, 7) and 
ime We .o141(8, 7), orthogonal over the surface of every torus? 


e 
If we choose a weight function p(s, 7) which is an even function of the variable 7, 


then we obtain the following 


[ [te Vals, Die" Vi, Ip, 1) dndp=0, M#M’ (9.1) 


M+2 
2 a2 ; ; 
[ | fe Wooa(s, 0) fe * We .o1-1(8, 2) p(s, n) dndd = 0, M’ XM (9.2) 
#0 70 


| [e'"*V is00(s, ) le” °W ag s2141(8, 0) p(s, 0) dn dd = O, (9.3) 


. What can be said about the integrals? 


which holds for every torus s = constant 


[e aad er | n) J fe we eer n) |p(s, n) dn do, l#l’ (9.4) 


a2 a2 
\ 


| | le “yy. o1+1(8, n) le” W i o1+1(8, 0) |p(s, n) dn dd, ls’. (9.5) 


It can be shown that if p = h, , then the integrals given by (9.4) and (9.5) do not 


vanish. In fact for p = h, , it can be shown that there is no set of linear combinations of 


* . >M ie ul ‘ : 
the wave functions « VY wor OH « Wrs214, Which form a complete orthogonal set 


over the surface of every torus. For p # h, nothing at present can be said about the 


vanishing of the integrals given by (9.4) and (9.5). 


. ¢ . im@ Js imé 

As it stands now one can say that, the wave functions « Visreor(8, 7) and e’”® 
Wr 8, form a partially orthogonal set over every torus. To complete the orthog- 
Mu | ; : ; 5 
onal set, one must at present use the Hilbert-Schmidt process for every torus s So . 


= gs, and each value of m one must form an orthogonal set. from 


That is, for each torus s 
™o Virwe(So, m), €'”"* Vireal(So, 7), °** using the Hilbert- 


the functions e'”® V}i(s» , 7), ¢ 
Schmidt process, and also form an orthogonal set from the functions e'"® HWy,,,(89 , ), 
er" Ww. (8) , n), -:: . However the functions do form a complete orthogonal set over 
the surface of the limiting torus s = s, where s) > 1. 

Apart from the difficulty involved in the problem of incomplete orthogonality 
4 property of non-separability, the toroidal wave functions derived in this 


Vv 


which is ¢ 
paper can be used to solve any problem of diffraction of acoustic waves by a torus or 
of a radiating torus. In fact they are useful in solving the vector wave equation, and 
already have been of practical value in electromagnetic problems. 
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BOOK REVIEWS 


(Continued from p. 236) 


Neutron transport theory. By B. Davison. With the collaboration of J. B. Sykes. Oxford 
University Press, New York, 1957. xx 450 pp. $12.00. 


This book has for its central theme the study of the Boltzmann equation for neutron transport, 
the fundamental equation that describes the migration of neutrons through material media. It is evi- 
dent that this study is of principal importance in the design of nuclear reactors. It is also of direct interest 
in astrophysics in the consideration of problems of radiative transfer which are very similar to those of 
neutron transport theory 

In this monograph the author has set for himself the task of giving a thorough review of all the 
important mathematical methods used in neutron transport theory. The subject matter is developed 
from first principles, and so a previous familiarity with the theory is not strictly necessary. Neverthe- 
less, this book is not meant to serve as a first introduction to the subject. It is an advanced work which 
assumes a fair degree of mathematical maturity on the part of the reader; it requires comparatively little, 
however, in the way of a knowledge of nuclear physics and quantum mechanics. 

The material covered in this work is conveniently divided into four parts, The first part Is devoted 
to a precise formulation of the laws of neutron transport, the derivation of the basic integral and integro- 
differential equations, and the relation of stationary and time-dependent problems. The second part, 
by far the largest in the book, is concerned with the methods of solution of the transport equation in 
the constant cross-section ipproximation, i.e., for the case of one-group theory. Apart from a consider- 
ation of the few situations which can be treated exactly, this section develops in considerable detail 
the various approximation methods that have been proposed to date; the spherical harmonics method, 
in particular, is discussed at great length. The remainder of the book is concerned with energy-depend- 
ent problems for the cases of spectrum-regenerating media (Part II) and slowing-down media (Part IV). 

This book fills an important gap in the literature, tying together, as it does, a considerable amount 
of material which had previously been available only in the form of technical reports and periodical 
articles. It is a serious mathematical work which is very well organized and very well written. It should 
prove extremely satisf\ ing to theoretical phy sicists and applied mathematicians who are interested 
in an up-to-date 
discussion of neutron diffusion theory as it is usually presented in textbooks on nuclear reactors 


iwecount of the problems of neutron transport which goes far bevond the simplified 


While one may question the nature ol certain limitations on the scope of the subject which were 


imposed by the author thus, the discussion is limited throughout to homogeneous media, with little 
or no reference being made to experimental methods or results), there is no doubt but that these have 


made possible a very well-knit and unified presentation; in any case, the book is already of considerable 


length. However, it would have been useful, especial] ina comprehensive treatise such as this, if the 





author had documented this work much more extensively. The book is highly recommended 


Davip Fe_pMAN 


Elementary theory of angular momentum. By M. I. Rose. John Wiley & Sons, Inc., New 

York, 1957. 248 pp. $10.00. 

The quantum theory of angular momentum occupies a central position in present-day atomic and 
nuclear physi s. Evidently, in the des ription ol complicated systems, it is important to be able to 
separate out those aspects which can be traced directly to the existence in nature of certain fundamental 
svmmetries from those which depend upon the detailed characteristics of the systems themselves (the 
specific shape of the nuclear force, for example). Notwithstanding the recent experimental verification 
that the so-called weak interactions are not invariant under spatial reflections, it still appears that 
absolute symmetry principle; hence, angular momentum is always conserved. 


rotational invariance is an 
The book under review 
Oak Ridge National Laboratory. As is perhaps evident from the title, it is a textbook and not a treatise. 


is an outgrowth of a series of lectures given recently by the author at the 


(Continued on Dp. 306) 


VARIATIONAL PRINCIPLES FOR GUIDED ELECTROMAGNETIC 
WAVES IN ANISOTROPIC MATERIALS 


BY 
WALTER HAUSER 
Lincoln Laboratory, M.I.T., Lexington, Mass. 


Abstract. It is the purpose of this paper to develop a general method for obtaining 
approximate solutions to the problems of the propagation of waves in a guide which 
may only be partially filled with material having tensor electromagnetic properties. 
With the introduction of an appropriate dyadic Green’s function we are able to obtain 
a formal solution to the problem in terms of an integral involving the field vectors in 
the perturbing rod. The reformulation of the original problem in terms of an integral 
equation enables us to construct a variational principle whose extremal value is insen- 
sitive to a great range of trial functions. The extremal value of the variational expression 
from which the integral equations for the field are derivable is shown to be proportional 
to (y, — y), the difference between the square of the propagation constant of the wave 
in the empty and loaded guide. Normal modes in terms of which an expansion of the 
dyadic Green’s function is obtained are defined. In the last section we demonstrate the 
ability of the variational method to improve the results obtained by perturbation 
methods obtaining a first order approximation for the propagation constant of a wave 
in a rectangular guide containing an infinite ferrite slab. 

Introduction. With the use of ferrites in the design of microwave gyrators, circulators 
and nonreciprocal transmission systems, one has become interested in the problem of 
cavities and waveguides containing materials having tensor electromagnetic properties. 
In general, such a problem is extremely difficult and tedious. One is almost impelled to 
look for approximate solutions. In those problems where the material fills but a small 
part of the cavity or guide, or where the electromagnetic properties of the perturbing 
material differ but slightly from those of the empty guide or cavity, perturbation theory 
has been somewhat successful in obtaining first order approximations [1]. There exist 
a number of problems, however, to which perturbation theory yields poor or no results. 

This paper is the first of a group of papers in which we shall concern ourselves with 
the development and application of a general method for obtaining approximate solutions 
to the problems of waveguides and cavities containing materials with tensor electro- 
magnetic properties. The method is an extension of the work of Schwinger [2] on the 
problem of isotropic obstacles in cavities. With the introduction of an appropriate 
dyadic Green’s function we are able to obtain a formal solution to the problem in terms 
of integrals involving the field vectors in the perturbing material. While the resulting 
integral equations are not any easiér to solve, there exists the advantage of being able 
to construct stationary expressions for the quantities of interest from which the integral 
equations for the fields within the material are derivable. Consequently, we have a 
very powerful method for obtaining approximate solutions for these quantities. In 
this paper, for example, we concern ourselves with the problem of the propagation of 
waves in a guide which may only be partially filled with a rod of uniform cross-section. 
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In this case, the first order variation of (yj — y°), the difference between the square of 
the propagation constant of the wave in the empty and loaded guide, is zero with respect 
to similar variations of the fields in the rod. Berk [3] has treated these problems obtaining 
variational principles which yield the differential equations satisfied by the electro- 
magnetic field. His expressions, however, are not applicable to lossy media in general, 
and furthermore require a knowledge of the fields everywhere within the guide. Our 
method in addition provides us with a systematic way of improving the trial wave 
function, thus permitting us to improve upon the results obtained by perturbation 
formulae. 

While the full use of the variational method consists of the substitution of trial 
functions with unknown variational parameters followed by the calculation of the sta- 
tionary quantity, we may at times obtain good first order results by simply substituting 
completely determined trial functions. In such cases the variational expression reduces 
essentially to a perturbation formula [4]. It has the advantage, however, of furnishing 
the best amplitude of the trial field within the perturbing rod. 

I. Integral equations for the electromagnetic field. The problem of a waveguide 
partially or completely filled by a rod of uniform cross-section permits separation with 
respect to the coordinate of the guide axis which we choose as the z-axis. Assuming an 
exp (jwt) time dependence, we can write the electric and magnetic fields as 


E(x, y) exp [j(wt + yz)] 
and 
H(z, y) exp [j(wt + yé)] 
respectively, where y, the propagation constant, is in general a complex quantity. 


In terms of E(x, y) and H(z, y), the z independent parts of the electromagnetic field, 


Maxwell’s equations in a region free of charges and currents reduce to 
(V. + jyvk) X E = —jou-H, (1) 
(V. + nk) X H = joe-E. 


e is the tensor electric permittivity, and u is the tensor magnetic permeability of 
the medium filling the guide. In the usual manner, by taking the curl of Maxwell's 
equations one obtains the wave equations satisfied by E and H. We find 


d Xd X E — weuoE = w u(e’-E) — jod X (y’-H), (2a) 
and 
d Xd XH — w’e.u.H = w,(u’-H) + jod X (e’-E). (2b) 
We have set 
d= V. + nk, 


e=el-+e’, 
/ 
v=ph~!l +’, 
where I is the unit dyadic, and ¢€, and yp are scalars whose choice depends on the problem 


at hand. 
From the form of Maxwell’s equations, we realize that the boundary conditions on 
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E, H, B, and D are not affected by the introduction of tensor e and w. Across a surface 
where e or uw or both change discontinuously, the tangential components of E and H, 
and the normal components of B and D are continuous. At a conducting surface we 
shall require the tangential component of E and the normal component of B to be zero. 

At this point we should also like to introduce the adjoint fields, E* and H*, which 

we take as the solutions of the equations 
d* X E* = jwH’-u = jwB’, 
d° X H’ = —jwE’-e = —pD"’, 
where d° = VY, — yk. 

In the case of lossless media where e and pw are hermitian [5], E* = E* and H* = H*. 
In the case of lossy media, the adjoint fields may also be simply related to the fields 
E and H. For example in the case of lossy ferrites, taking Lax’s form of the magnetic 
permeability tensor [4], having its diagonal elements even in w and its off diagonal 
elements odd in w, 

H(x, y, y,w) = H(z, y, — y, — a), 
and 
E(x, y, y,#) = E(x, y, — y, — o). 

Consider now the problem of a waveguide of cross-section (S,; + S,.) with region S, 
occupied by a material having tensor electromagnetic properties. We are confronted 
with the problem of solving Eqs. (1) within the two regions subject to the boundary 
conditions stated above. To do so, we* employ the help of the dyadic Green’s function 
which satisfies the differential equation 


d® x d® X Z°(x|x’) — w’€ouoZ"(x|x’) = 15(x — x’) (3) 
everywhere in the guide, where x = zi + yj. 
The Dirac delta function 6(x — x’) = 6(x — x’)é(y — y’). We further require Z* 


to satisfy certain boundary conditions at the boundary of the guide. The Green’s function 
N“(x'x’), which satisfies the condition 
n X [N‘(x!|x’)] = 0 

when x lies on the boundary of the guide, we refer to as the electric dyadic Green’s 
function. , 

Analogously the dyadic Green’s function, M*(x'x’), which satisfies the condition 

n X [d* x M*(x!x’)] = 0, 

when x lies on the boundary of the guide, we call the magnetic dyadic Green’s function. 

For the solution of the fields in problems where either E or H is divergenceless, we 
further require the respective dyadic Green’s function to satisfy the condition 

d°-Z*(x|x’) = 0. 
We now rewrite Eq. (3) in the form 
V. xX @ X ND — nk X @ X ND — KONE = ad(x — x’), 


where we have set kj = w’€uo and N2 = N*-a. @ is an arbitrary vector introduced in 


*We follow closely the work of J. Schwinger [2] on the problem of obstacles in cavities. 
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order to simplify the handling of the dyadic Green’s function. Taking the dot product 
of Eq. (2a) with N’ , the above equation with E and subtracting, we find that 


E-aé(x — x’) = V,:(N2 X d X E) —EX @’ X N) + joN? X (v’-H)] 
+ w'po(e’-E)-N2 — jw(d’ X N2)-(u’-H). 


Utilization of Maxwell’s equations and the boundary conditions satisfied by the tan- 


gential components of E and H, leads to* 


E(x’)-a@ = w fo | N i (x|x’)-e’- E(x) dS — jw [\d° & Ni(x'x’)]-[w’-H(x)] dS. (4) 


J Vv, - P(z, y)drdy = f n: Pds. 
The magnetic field may be computed directly from the previous equation through the 
use of Eq. (1), or it may be obtained by utilizing iq. (2b) and the magnetic dyadic 
Green’s function. By repeating the steps which led to Eq. (4), we obtain 


, 


H(x’)-a@ = we | M?:(u’-H) dS + jw | (d° xX M °) -(e’-E) ds. (5) 


d “ 


With the help of the reciprocity relations satisfied by the Green’s functions it is readily 
shown that the two methods for obtaining H are equivalent [6]. 
Utilizing the two dimensional Green’s second identity 


| [A°-(d X d X B) — B-(d’ Xd’ xX AD] dS 


= n-[B xX @" X A) - A’ X XB) ds, (6) 


d 


we find upon setting 


AY = ZU(zix) 
and 
B Z3(x|x’’) 
that 
Zix’’ |x’ eS (x ix") 
where Z is the transpose of Z. Similarly, setting 
=f X Mizz’ 
and 


we obtai 
d° & M*(x’’ |x’ dX N(x’|x”’ 


II. The normal mode expansion of the dyadic Green’s function. Our next task is 


*We have used the two dimensional Green’s theorem 
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that of finding an explicit expression for the dyadic Green’s functions. Such an expression 
can, of course, take the form of an expansion in terms of the complete set of vector 


eigenfunctions satisfying the homogeneous equation 
d® Xd" X Al — (eh + PAL = 0, (7) 
the boundary condition 
nxA.=0 
for the electric modes, and the boundary condition 
n <.(d" < A‘) = 0 
for the magnetic modes. 
We can readily find such a set of vector eigenfunctions in terms of the usual normal 
modes of the guide. If F., and H,, are the z independent parts of the z components of 
the usual TM and TE modes of the guide, we find that the vectors 


ss —k x V. Hin 
ro (8) 


| = =i — y, — ele, 
_LYn — @ €nMo Y 


EAL, ¥Y) = EX(z, y), T.(2, 9) = Hl, 9), 


F(z, YY, Yn > @) = 


where 


and 


| 2 — ae k x Vv; = 
Yn — @ €oMo 


FAC, Y, 7; Yn »@) = 
| -,—y,-* « |e. 
Yn — W Eobo Y 


satisfy Eq. (6) and the boundary conditions which have to be satisfied by the electric 


modes. In addition they are divergenceless, 
d°-F; = 0. 

For completeness, we need another set of vector eigenfunctions whose curl is zero, 
d* <X f, = 0. 


Such a function, which is the gradient of a sealar, 


f d’¢,, , 
will satisfy leq. (6) for 

kK. —y 
and the bound ry condition n X f 0, if d = Oon the boundary. In order however, 
for the functions f* to form an orthogonal set among themselves, (they are automatically 
orthogonal to the functions F,), we choose for the set of scalar functions ¢° , the set 


satisfying the equation 


(d°-d° + w,)¢, = 0. 
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Analogously we obtain the magnetic vector eigenfunctions, the divergenceless vector 
eigenfunctions, 
| —f¥. Y 
| ae Vi- ye se » 
/ ~ Rae 0/0 
Gi(z, Ys 2, Vs ny w) _ 4 
— Jweo a 
kX VED, 


Va ied W Epo 


(9) 


and the curl-less eigenfunctions, 
a a a 
= d*y;, ’ 
with n-d’y2 = 0 on the boundary. The functions g?* are orthogonal to each other and 
the functions G, if the scalar functions y% satisfy the equation 


(d°-d° + ws’)y, = 0. 


We now demonstrate the orthogonality of the vector eigenfunctions as expressed by 
[Fe PidS = Abn, (a) 
[t-t2ds = 0%8,., () 


[ £.-F2as = 0, (c) 


[c-: dS = »%6,,, () (10) 


| ee: dS = nibam, (€) 


/ G,-g2 dS = 0. (f) 
Equations (a), (c), (d) and (f) are readily obtainable through the use of Eq. (6), whose 


right hand side vanishes if A and B are both either electric or magnetic vector eigen- 
functions. From Eq. (7) it then follows that 


(2 — x3) [ AS-B, dS = 0. 
Equations (b) and (e) can be shown to hold since for 6, being any scalar function 
[ @0.)-(a°e as = [ V.-(0.d°6) dS — | 6,(4°-4°0%) dS 


= [ v.46.) 62 dS ae il 65d -dé,, ds. 


If now 6, satisfies the equation 


d°-d°e. + w.62 = 0 
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and the boundary condition satisfied either by ¢, or y, , we have that 


Il 


| d0,,-d°0, dS = w? / 0,02 dS 


nO, aS. 


Il 
£ 
Zw 
—, 
> 


Thus if w2 * wZ the left hand side must be zero. We assume that degenerate eigen- 
functions have been orthogonalized. It is readily shown that the divergenceless electric 
and magnetic vector eigenfunctions are related through the equations 


qdxF,= — jwpoG,c, (11) 
and 
d x G,, aed jweoF 8, ’ 
where 
A for TE modes 
— Yn 
2 :. 2 
Yn me for TM modes 
é 0 
and 
2 ft 
Yn a7 for TE modes 
6. = Y 0 
«3 for TM modes. 
Yn 


The functions F, and f, together form a complete set of orthogonal vector eigenfunctions 
which may be used to represent any vector field having zero tangential component on 
the boundary of the guide. Similarly the set of orthogonal vector eigenfunctions G, 
and g, may be used to represent any vector field satisfying the same boundary conditions 
as those imposed upon the magnetic vector eigenfunctions. We should thus be able to 
write 


Ni(x|x’) = Do a.Fi + > bf: , (12) 


and 


II 


M:(x|x’) = >> a/Gi + >> big’ . (13) 


Realizing that the normal modes satisfy Eq. (2) with e’ and pw’ equal to zero and 
k2 = w €yuo replaced by (x2 + 7’), we find 


A.(x/)-0 = (2 +77 — 3) [ A@)-Zilx’) as 


as a solution of Eq. (7). Inserting expansions (12) and (13) for the dyadic Green’s func- 
tions into this integral equation, and utilizing the orthogonality relations (10), we find 
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that 
F(x’) -a@ 
a, = 5 ° 
T — GIA 
f —f,(x’)-a 
‘ keQ? 
. G,,(x’)-a 
a, 5 \\2 
\y a )A; 
and 
(X’)-@ 
b/ = = aid 
kom 


We have thus found the expansion of the dyadic Green’s function in terms of the 


‘normal modes, 


, F(x) F,,(x’) | f(x)f,,(x’) 
N“(x|x’) = 2 2 7 oy Roads ys 2 ’ (14) 
- YY =~ ¥,JAa, c 2; 
and 
. G?(x)G,,(x’) 1 (X)g,,(X") ‘ 
M"(x x’) = } 2 2)\2 —~ b 8 ‘ (15) 
; yo Faye vi N, 


For those problems where N or M is taken to be divergenceless, the sum over the f,’s 
or g,’s respectively, except for the terms which are also divergenceless, are not to be 


included. 
III. Variational principles. [In terms of the normal mode expansion of the dyadic 


Green’s function 


| f'-2’-E as fee) 
St .. > - F (16) 


(y° — ¥;) Sg & Q; 
where having set 
E B F 
and 
H, ne iid 
aay: | fz e’-EdS + (a,8 | x y Has |, 17 


Taking the curl of Eq. (16) yields 


J 


u’-H(x’) ony", ; : 
. Zz: x H,(x’) — H(z’). (18) 
be ; Y = ens 
We note again, that in those problems where E is divergenceless we shall omit the 


last sum in Eq. (16), except for the f, term which is also divergenceless. 
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Analogous to the foregoing we find 


_ [eetas 


.  (J5)"EXx) = w Mo . , 
- SS ; — 9 
E’(x) > <n ke 2 @ f(x) (19) 
and 
8] ve x (a 8 1/2 : e\a a/ 
H"(x)-u ms > ¥nBr) é (.J%,)"Hi(x) ~ Hx). (20) 


Mo n=0 (y one Yo) 


l’rom the wave equatéons satisfied by the electromagnetic field and the normal modes, 
we obtain 
V.-|E x d° x Ei + dd X E) X E} — jE. X (w’-H)) + Wy — yDJEV-E 
= w woE.-e’-E — jw(d* X E*)-(y’-H). 
It follows therefore that 


KLa@G=-f | E’-EdS/8,A° . (21) 


We can fix the amplitude of the field within the rod by restricting ourselves to fields 
satisfying the condition f Ej-E dS = (@,)'*A; , where EF, is the unperturbed guide 
solution whose perturbation we seek. Thus, 

Js = (y° — Y)(B)"”, (22) 


and utilizing Eqs. (16) and (18), it follows that 


(J6)°(Bo) AG = wo D5 , (23) 


where 
D; = | E*-e’-Eds + / | H-y’-y’-Has + [| H’-y-Hds 
. My « “ 


od > B, Adi (J) (24) 
why I WY — yn) : 


2 | f*-2’-E as|| | E’ -e’-f, as| 
, © Mo a ny - — . 
+e Q; 
We can now set up a variational principle from which integral equations (16), (18), 
(19), and (20) are derivable. Combining Eqs. (22) and (23) we obtain the expression 


(y° — yi)w Mo Fe ot (25) 
, = ——-* 25 
Ap D 
It is readily shown that the first order variation of the right hand side of Eq. (25) with 
respect to the fields, subject. to the condition 

Jé = (J5)* = Doe wo(Bo)'’? Ao” (26) 
yields the integral equations for the electromagnetic fields within the rod. Since expression 

(25) is amplitude independent, this condition will be automatically satisfied. 
Quite analogous to the foregoing we can obtain another variational principle yielding 
the integral equations one would obtain starting with Eq. (5). These integral equations 
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are 
2 ma -H dS 
-_, J"H,,(x’) 0 | E» ¥ | , - 
H(x’) = oe 7 7) = - 2 es meee °° oP (27) 
e’ -E(x’) = 2. eof oe) —— E(x’), (28) 
€o n 7 — n 


and their adjoints. 


Jy = 28 | H:-y’-H dS + (a,8,)* | Et-e”-E as 


QnAn 


iw [amas / ax . 


Restricting ourselves to solutions which satisfy the condition f H§-H dS = (a»)'’*d} , 


(29) 


we find 
J = (y° — Yo)(ao)*”* (30) 
and 
(J%)*(ao)"AG = w'eD5 , (31) 
where 


e 


D> = + [ Br-0'eE dS + ao: dS + | H’-»’-H dS 
€o J 
1 poausy 
w €5 n=1 (y’ ae Yn) 
| [ g.-u’ -H as|| | H’-w’-g,, as| 


ose J 


2 
n=O Nn 


The variational expression from which these integral equations for the fields within 
the rod are derivable is 
(y? — yo)w € J'o( do)" ; 
——— (32) 


Xo ~ DE 
IV. Application of the variational principle. In the present application of the vari 
ational principle, we are interested in demonstrating its usefulness in improving the 
results obtained by perturbation methods. Consider the problem of a rectangular wave- 





X—» o 


Fig. 1. A single ferrite slab in a rectangular waveguide. 
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guide containing a thin ferrite slab, transversely magnetized and placed against one of 
the sides of the guide (Fig. 1). The exact solution of the problem [7] leads to a complicated 
transcendental equation for the evaluation of the propagation constant. We shall obtain 
the first order perturbation of the propagation constant of the lowest propagating mode 


of the guide, the TE,, mode, for which 








0 3 ain = 
| r/L*"L 
(F,) = | — 10 sin =! and (aGo) = 0 
L L 
rx 
0 3 
cos L ] 


The transversely magnetized ferrite slab has a tensor magnetic permeability 


My 0 jK 
(uy) =v | O 1 Of, 
—K O wm 


where », and « are functions of*the frequency and the external DC magnetic field [8). 
The electric permittivity, ¢, is a scalar, and E therefore divergenceless. 
For first order results, the choice 
Evia = AF, , 
where A is a constant to be determined, is as good as any other. Its use in the perturba- 
tion formula 
> ay A? a a = 
(y° o= Yo) > = [ Fi-e-E dx -f [ Gi-y’-H dx = jo (33) 
W Mo a J 
which can be obtained from Eqs. (17) and (22), and is equivalent to perturbation formula 


(7) of [4], leads to the result 


> \ (ar - Ww My K (34) 


W Lo 
Ho 2k y 
ae so er As ’ 
w L(u; —«) a/b 
where 
F wx ” rx WX 
d,, = | sin’ — dz, die = sin — cos — dx 
' L - Jo L L ; 
+5 ’ 2 1 
2 Wx 2 why L 
do. = | cos — dz, and Ao = >a’ 
~ Jo L (x/L)~ 2 
We have set 
i — wy — K jo fs 
Hieriat = = 2 > Mo Gy = et J x Gp )ao 
M,; —K Mi — k 


and have kept only those terms which yield terms of the second order in 6 or less, except 
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for large values of e, where in the expressions containing ¢ we have kept terms yielding 
terms of the third order in 6 or less. We have also neglected the integrations with respect 
to the y coordinate as they all yield the same value or zero. 

We find the perturbation result to be dependent on the amplitude of the trial field. 
In this problem for which an exact solution exists, one is able to obtain an estimate of 
the amplitude. In general, however, we are not as fortunate, and in such cases Eq. (23) 
may be used to obtain a first order approximation of the amplitude. Setting E = Ae 
and H = 4h we find from Eq. (23) that 


(75)" Ao - 
A= d ae (35) 
W Uor’o 
where 
(jo) = e’-e’- F, dS + | h*-w’ (a G, )dS 
and 
At my. 


Combination of Eqs. (33) and (35) yields the variational expression 


ym +, (4;)° 
: . a . (36) 


Before evaluating D§ let us note that since the fields are independent of the y coordinate, 


we can find a closed dyadic Green’s function satisfying the differential equation 
d° x d’ X N(ziz’) — kU N(ziz’) = li(x — 2’), 


where 


Such a dyadic Green’s function is given by 
; ; ; l / 
N(x x Ni(xiz’) + Ni(v'v’) — j2(dd -N, — d‘d’-N.,), 


N,(z/\2x’) IN, ( x\x’) 


where the scalar Green’s function 


y) , l cos k(x — 2’ — L eee i 
Ve Lia : J 
2k sin kb } 
cos k(z2 — 2’ + L) x < 2x’ 
and satisfies the differential equation 
a” : ; 
5} MN, + KN, — d(x — 2’). 
dx 
No(xlzx') = —i2, + jie + KKM, 
where 
l 
M(x\x’) = cos k(a + 2’ — L) 


~ 2ksin kL 
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and 


> 


P=k-—y’. 
In terms of this dyadic Green’s function 
j j I F a , , 
Do | e’-e’-e dx + h’-w’-h dx + | h’-w’-w’-h dx 
. . Mo « 


I e-e’-N'(x\x’)-e’-e dx dx’ 


ve 


i | (h’-w’)-d’ X [d° & N(x\2x’)]"-(w’-h) dx dx’ 
fo Je 


+ jw I| (e’-e’)-(d° & N(vix’)]"-(u’-h) dx dz’ 


jw || (h’-p’)-[d & N‘(x!2’)]-(e’-e) dx dx 


W Ly *¢/ se\a 
; 2 Woo) J. 
+ GF at Yoo") 


Inserting the trial fields, we find that to first order 


i) 


. : ui, : (j3)" 
MM; — K 


or 


4 w BZE. 
My 


Thus 


, L (mi — K (ur — mn —K) (a) 2k (=) eas 


& 1 


It might be of interest to note that the amplitude one finds utilizing Eq. (35) is the 
limit of the exact amplitude as 6 becomes very small. 

In Vig. 2 we plot y for a wave at 9000 meps as a function of 6, as computed from the 
above equation, for a ferrite slab having a magnetization of 3000 gauss placed in an 
external DC magnetic field yielding an internal value of 1000 oersteds, and compare 
the approximate results with the exact calculation of Lax and Button [7]. 

While the full use of the variational method consists of the substitution of trial 
functions with unknown variational parameters and finding the values of these param- 
eters which make the variational expression an extremal, we see that we may at times 
obtain good first order results by simply substituting completely determined trial 
functions. In such cases the variational expression will yield better results than the 
perturbation formula with the same trial field. 
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Fia. 2. Propagation constant for the TE; guide mode perturbed by a single ferrite slab at 9000 mcps 


as a function of slab thickness. 
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ON TRANSFER FUNCTIONS AND TRANSIENTS* 


BY 
ARMEN H. ZEMANIAN 
New York University 


Abstract. In the first part of this paper the concept of the positive real function is 
generalized so that it is applicable to transfer functions and the functions, satisfying 
this generalized concept, are arranged into classes. Some tests are then developed which 
may be used to determine whether a transfer function belongs to a particular class. 
It is also shown that if transfer functions have certain general forms then they will 
automatically be members of one of the classes. Finally, several properties of the phase 
functions for such system functions are developed. 

The second part of the paper considers the impulse and step responses corresponding 
to these transfer functions. It is found that these transient responses are bounded and 
moreover the rise time and settling time of the step response are found to be greater 
than lower bounds which depend on the amount of the maximum overshoot (or under- 
shoot). These results are also generalizations of the restrictions developed on the transient 
responses of positive real system functions and are considerably stronger. As the difference 
in degree between the numerator and denominator of the transfer function increases, 
the magnitudes of these lower bounds also increase. 

Introduction. The question of what restrictions exist on the transient responses of 
various classes of networks has been treated in a series of papers [1-3]. In particular, 
bounds on the impulse and step responses have been given when the corresponding 
frequency responses are restricted to being of constant sign or monotonic in a semi- 
infinite interval. An example of this is that, if a rational system function, Z(s), is positive 
real and has one more pole than zero, then the absolute value of the impulse response is 
never greater than 1/C which is the constant multiplier of the system function. These 
results lead in turn to lower bounds on the rise time or settling time of the step response 
of such networks when the overshoot and undershoot is specified. 

Recently, one of these results has been improved by Ovseyevich [4]. The improve- 
ment is on the bounds developed on the impulse response of the positive real system 
functions in the interval (0, 7’) when the bounds on that response in the interval (7, @ ) 
are known. A related problem has been considered by Cutteridge [5] who determines 
the optimum rise time of passive two-terminal networks under various restrictions. 
For instance, considering only the step responses which are monotonic or have only a 
small overshoot, the optimum rise time is determined when the system function has 
only two zeros and three poles. 

This paper deals with a generalization of these results which is applicable to transfer 
functions. In particular, certain classes of functions are defined from the system func- 
tions having any number of poles in excess of zeros such that the corresponding transient 
responses are restricted in a manner similar to the restrictions existing on the transient 
responses of positive real functions. The strength of the restriction on the transient 
responses increases as the excess of poles over zeros increases. 

In the first part of this paper, these classes of system functions are defined. The 
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definition may be considered a generalization of the concept of positive real system 
functions which is suitable for application to transfer functions. Various tests are then 
developed which determine whether a system function is a member of any class. One 
immediate result is that, if the system function has poles only in the left hand complex 
frequency plane and no zeros anywhere except at infinity (that is, if the system function 
is the reciprocal of a Hurwitz polynomial), then it will be a member of one of the classes. 
Furthermore, certain properties of the phase functions for such system functions are 
also developed. 

In the second section the restrictions on the transient responses for the defined 
classes of functions are derived. These constitute bounds on the impulse responses and 
the step responses. From these, lower bounds are developed on the rise time and settling 
time of the step response when the overshoot and undershoot are given and, conversely, 
the specification of the rise time or settling time fixes the lower bounds on the maximum 
overshoot or undershoot. 

Part I. A generalization of the concept of positive real functions. The systems con- 
sidered in this paper are lumped, linear, fixed, finite and stable systems so that the system 
functions’ Z(s) have the following rational form where s is the complex frequency variable 
o + jw, the coefficients and the constant multiplier K are real numbers, and n and m 
are positive integers (m > n). 

Zs) = K s" + a,-.8 y te cos he _ a N(s) 

— s” + b,,-,s” , +---+b, D(s) () 

The term ‘stable system”’ is taken tg mean a system whose response will eventually 

become arbitrarily small once the input is removed. Thus the polynomial D(s) is a 
Hurwitz polynomial all of whose roots have a negative (non-zero) real part. 

The special classes of functions that are of interest here will be defined after a few 


preliminary remarks. Let 


Z,(s) = (—j)* | ds, , | ds, 2°? | Z(8o) ds, a (2) 
where the real parts of the complex variables s, , s, , --- , 8-1 , $ are all non-negative 


and q < m — n — 1. Letting s, = o, + jw, , the real and imaginary parts, R,(w) and 


I,(w), of Z,(jw) may be obtained from the real and imaginary parts, R(w) and I(w), 
of Z(jw) by (3) and (4). 
Z (Jw) 4 R,(w) ie qT (w) 


aw x » 


R,(w) = | dw, | . , 2 °°" | Rha) den (3) 


e x « “ 


nw nw, . 


ie w) = | dw, 1 | dw, a. 2S | I (wo) dar : (4) 

It is evident from the following argument that these successive integrations may be 
performed m—n — | times. Since Z(s) is analytic for ¢ > 0, the integral of Z(s) between 
two given points will yield the same result for all paths between these points which do 
not enter the left half s plane. Furthermore, the inverse power series expansion of Z(s), 


1These system functions may be impedances, admittances or transfer functions that are ratios of 


currents or voltages. 
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which holds for | s | > M, is 
in —~ K . 
Zs) = 2 3 (5) 
where K,,_,, = K and M is a positive number greater than the distance from the origin 
to the pole of Z(s) farthest away from the origin. Since this series converges uniformly 
for | s| > M, it may be integrated term by term yielding a series which also converges 
uniformly in the same region. This process may be continued g times where 
q < m — n — 1. Thus the successive integrations of Z(s) given by (2) yield unique 
functions Z,(s) which are all analytic for o, > 0. 
Now the aforementioned classes of functions Z(s) given by (1) are defined as follows. 
Definition. Z(s) will be called a class k function, where k = m — n, if one of the following 
inequalities holds for —7 <w< +o. Fork = 2v+ 1 (v = 0,1, 2, ---), 


(—1)’R,_,@) > 0 (6) 
and, fork = 2v (v = 1, 2, 3, -+-), 
(—1)’*'J,_,@) > 0. (7) 


The class k functions have the interesting property that they are related to the positive 
real functions in the region where they are defined (that is, in the right half s plane and 
on the imaginary axis). 

Theorem 1. If the system function Z(s) is a class k function, then (—j)"~"~' Zn —n-1(8) 
is a positive real function for ¢ > 0 and the constant multiplier K is positive. 

Proof. The inequalities (6) and (7) state that the real part of (—j)"-""' Z,,-n-1(jw) 
is non-negative for all w. Moreover Z,,.,,-;(s) is analytic for ¢ > 0 since Z(s) is analytic 
in this region. Thus by the minimax theorem, the real part of (—j)""""" Z,—»-:(s) is 
non-negative for ¢ > 0. Therefore to prove the first part of the theorem it need only 
be shown that (—j)" ""' Z,,_,-1(s) is real for s = o > 0. But 


(—7)""" Zm-n-1(0) = | iia [ a [ Ag, -n-3 °°" [ Za.) dao 


and the right hand side of this expression is a real function of a real variable. 
Finally the series expansion of Z(s) given by (5) may be integrated term by term q 





times according to (2) for g < m — n — 1. The resulting expression for Z,(s), which 
holds for |s| > A, is 
, K 1 
Z (8) = . —— #-_______ - . ° (8 
a J & Ge 1)(u — 2) >> (uw — @) g"=2 ) 


a ‘ . ° ° . . \m~-n-l £ * 
rhus the first term of the inverse power series expansion of (— j) Zn-n-1(8) is 


Bo 


(m —n— IIs’ 


where K,,_,, = AK. Moreover for s real and sufficiently large, this first term becomes the 
dominant term. Therefore K must be positive. This completes the proof. 

A property of the functions Z,(s) which will be used subsequently is the fact that 
their real and imaginary parts are either odd or even. This is stated by Lemma 2. To 
prove this however, Lemma 1 will be needed. 
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Lemma 1. Let h(w) be even (or odd) and integrable for —7 < w < +; let gw) = 
f2. h(u) du; and let qi) = 0. Then q(w) is odd (or even). 


Proof. 


q(o) = | h(u) du + | h(u) du = 0. 


d 2 w 


Therefore, 


q@w) = | h(u) du = | h(u) du. 


J—@ J « 


Replacing wu in the last integral by 


Il 


q(w) -| h(—v) dv. 
But for h(w) even, h(—v) equals A(v) so that q(w) equals —q(—w). (For h(w) odd, h(—v) 
equals —h(v) so that g(w) equals g(—w).) 


Lemma 2. For q even and less than m — n, the real and imaginary parts of Z,(s) are 
even and odd, respectively; for q odd and less than m — n, the real and imaginary parts of 
Z,(s) are odd and even, respectively. 


Proof. Consider a closed path of integration in the right half s plane for the integral 
£ Z,-;(s) ds which is composed of a straight line segment parallel to the imaginary 
axis and to the right of it by the distance c > 0 and a segment of a circle C, whose center 
is at the origin. 


pe+id 


p Z, (@)ds= [| — Z,.(8) ds + | Z,-1(8) ds. 

V¥e—jd JC, 
This integral is zero since Z,_,(s) is analytic for ¢ > 0. Moreover as the distance from 
the origin to the circular segment increases without limit, the integral along this circular 
segment will vanish so long as gq < m — n — 1, for then Z,_,(s) is o(1/s) as s > &. 
Thus for c > 0, 


acti@ 


| Z,-1(8) ds = 0. 


Hence the real and imaginary parts of Z(s) satisfy the hypothesis of Lemma 1. Therefore 
the real and imaginary parts of Z,(s) are odd and even, respectively. Furthermore 


they also satisfy the hypothesis of Lemma 1 so long as m — n is greater than 2. This 
application of Lemma 1 may be made m — n — | times to obtain Lemma 2. 


While the first theorem of Part II applies to all class k functions, the other theorems 
and corollaries hold only for certain class k functions. In particular, Theorems 10 and 
11 have been proved only for those functions in class m — n whose real or imaginary 
parts, given by the left hand side of either (6) or (7), are not only positive but are, 
moreover, monotonic decreasing for positive w. This condition is stated by the following 


expressions where k = m — n. When k = 2v + 1 (v = O, 1, 2, ---), 


(—1)’"R,-2@) > 0 for wo >0 (9) 


and, when k = 27 (vy = 1, 2, ---), 
(—1)’I,-.@) >0 for w> 0. (10) 
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Actually, any system function of the form of Eq. (1) which satisfies one of these inequali- 
ties will be automatically a class k function. For the function given by the left hand side 
of (9) or (10) is an odd function so that one more integration according to (3) or (4) 
will yield one of the inequalities of (6) or (7). 

Finally, it should be noted that a slightly different definition of the class k functions 
leads to a simpler form for Theorem 1. If the factor (—j)* in the right hand side of Eq. 
(2) is replaced by (— !)*, the class k functions may be defined by the condition that the 
real part of Z,_, (jw) is non-negative for all w. It may then be shown in this case that 
Z,_,(s) is a positive real function. However, the proofs of some of the theorems are 
simpler if the former definition is used. 

The subclass & functions. The question of determining whether a particular Z(s) 
is a member of any class without performing the required integrations remains. Several 
tests have been devised which are applicable to certain functions in a given class but 
not to all. These functions comprise the subclass k. 


Definition. Z(s) will be called a subclass k function, where k = m — n, tf, for k odd, 
R(w) has k — 1 changes of sign for —2 < w < + and R(O) is positive and, for k even, 
I(w) has k — 1 changes of sign for —7 <w< @ and dI/dw at w = 0 is negative. 


If a system function satisfies the conditions of the second definition then it will also 
satisfy those of the first definition and one of the inequalities of (9) or (10). Therefore, 
all the results of part II hold for all subclass k functions. 


Theorem 2. All members of subclass k are members of class k. Moreover all subclass k 


functions satisfy one of the inequalities given by (9) or (10). 


Proof. Virst consider the case where m — n = 2v + 1 (v = O, 1, 2, ---). The function 
R,(w) must have a smaller number of changes of sign in the interval —» < w < w;, 
where w,; is any zero of R,_,(w), than does R,_,(w). Moreover, invoking Lemma 2, it 
may be seen that R,(w) is odd or even when F&,_,(w) is even or odd, respectively, so that 
R,(w) has less changes of sign in the interval w; < w < © than does R,_,(w). Thus each 
integration removes at least one change of sign from the finite w axis. Moreover R(w) 


has m — n — 1 changes of sign and R,,_,-,(@) equals zero, so that each integration 
must remove only one change of sign. (Otherwise F,,_,-,(©) would not equal zero.) 
Therefore 2,,_,-2(@) has only one change of sign which is at the origin and R,,_,-,(w) 
has none. Since R(O) is positive, (—1)’ ' R,-»n-2(w) > O for 0 < w-< © and 
(—1)’R,, w) > Ofor —© <w< +o. This proves the theorem when m — n is odd. 

The same argument given in the preceding paragraph may be applied, when m — n = 
2vn(v = 1, 2, 3, ---), to obtain the remaining portion of this theorem. In this case, the 
fact that d//dt at t = 0 is a negative quantity implies that (—1)’J,,_,-2(@) > 0 for 
0 <w < @ and that (—1)""'/,,_,;(w) > 0 for all finite w. This completes the proof. 


The argument given in this proof yields a lower bound on the number of sign changes 
in the real or imaginary parts of a system function for real frequencies even though it 
may not be a class k function. This fact will be needed subsequently. 


Lemma 3. Any system function having the form of Eq. (1) must have at least m — n — 1 
changes of sign for R(w) if m — n ts odd and for I(w) tf m — n ts even. 


Proof. Otherwise R,,_,-,(@) and J,,_,-,(@) would not be equal to zero as they must. 
A system function having a positive constant multiplier and no zeros in the finite 
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complex frequency plane (that is, a function which is a reciprocal of a Hurwitz poly- 
nomial) will automatically be a subclass k function as stated by the next theorem. An 
example of a network whose transfer function is of this type is the RC ladder network 
which is considered by a number of authors [6-9]. The same form of transfer function 
holds for more generalized ladder networks [10]. 


Theorem 3. Let the system function Z(s) have the following form, 


K K 


Z( 8) = m ie m 1 “Nat ie = * \ ’ 
s” + 5b,,-.8 +-+--»- +6) D(s) 
where D(s) is a Hurwitz polynomial and K is positive. Then Z(s) ts a subclass m function. 


Proof. The proof depends upon a known property of Hurwitz polynomials, namely, 
that all the zeros of the even and odd parts of a Hurwitz polynomial of mth degree are 
simple and purely imaginary [11]. Thus, if m is odd, R(w) has m — 1 real, simple zeros 
and, if m is even, /(w) has m — | real, simple zeros. Furthermore, the series expansion 
of Z(s) is 

K_ Kb, 


Zs) = , > ° + 


0 


Since D(s) is a Hurwitz polynomial, all its coefficients are positive [12]. Therefore, 


K 
R(0) = =i > 0 


at | . et ee 


das b, 


Hence, all the conditions of the definition of a subclass m function are satisfied. 
Tests for a subclass / function. Two tests have been devised which may be used 

to determine whether a given system function is a subclass k function. These tests 

determine the number of real zeros in the real or imaginary parts of the system function 


whose expression for real frequencies is given by (11). 


=e - N(jw) D( — ju) 
lid) « K -SS™. (11) 
D(jw) 
Since all the roots of D(s) have negative real parts, | D(jw) |~ is positive and finite for 


all finite w. Thus, the zeros of the real and imaginary parts of Z(jw) are the zeros of 
the real and imaginary parts of N(jw) D(—jw). Let P(w°) be the even part of N (jw) 
D(—jw) and let wQ(w") be its odd part. Replacing the variable w* by x, the following 


test may be stated which is based on Descartes’ rule of signs [13]. 


Theorem 4. A system function Z(s) is a subclass m — n function if the following con- 
ditions hold. For m — n odd, the number of sign changes in the coefficients of P(x) is 
(m — n — 1)/2 and P(O) ts positive. For m — n even, the number of sign changes in the 
coefficients of Q(x) is (m — n — 2)/2 and Q(O) ts negative. 


Proof. First consider the case where m — n is odd. The number of real roots of R(w) 
equals twice the number of positive roots of P(x). By Descartes’ rule of signs [13] the 
number of positive roots of P(x) is less than or equal to the number of variations of 
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sign in the coefficients of P(x). So by hypothesis, the number of real roots of R(w) is 
less than or equal to m — n — 1. Moreover, by Lemma 3, the number of changes of 
sign for R(w), which is less than or equal to the number of real roots of R(w), is at least 
m — n — 1. Thus W (w) has exactly m — n — 1 real simple roots and the conditions for 
the definition of a subclass m — n function are fulfilled. 

The proof for the case where m — n is even is the same except that now the number 
of real roots of J(w) equals twice the number of positive roots of Q(x) plus one more. 

This theorem may be applied to determine another general type of subclass k function 


having only one zero as given by Eq. (12). 


, ‘ 8 + A -S + A 
Zs) = K s” + b,,-,8” ’ +.-- +6, a) D(s) (12) 
A condition that the coefficients of Hurwitz polynomials satisfy is 
a = 
o..4 °° > ce Le 8 (13) 
Ss D.~« 


The next to the last term in (13) is b,/b. when m is even and b,/b,; when m is odd. (In 
those cases where m = 3 or m = 4, these inequalities are a consequence of the Hurwitz 
criterion [14]. Professor C. F. Rehberg has proven that they hold for Hurwitz poly- 
nomials of any degree so that the conditions (13) may be omitted from the hypothesis 


of the following theorem.) 


Theorem 5. If a system function Z(s), having one zero, has the form of Eq. (12), tf its 
coefficients satisfy the inequalities (13) and if the postition of the zero satisfies b,,_, > ay > 
by /b, for m odd and b,,_, > ay > O for m even, then Z(s) is a subclass m — 1 function. 


! 


Proof. It will be shown that a system function which satisfies this hypothesis will 
also satisfy the hypothesis of Theorem 4. First consider the case where m is odd. Since 
m — 1 iseven, Q(x) is to be calculated and the number of changes of sign in its coefficients 
determined. 


m—3)/2 


Q(x) 7" 0. 1 — ax? + (adoba-2 — Dn-s)2 


+t ~ ah 4... 43 at | 


Now it is evident that if the conditions (13) are satisfied and if b,,; > ay > b)/b, then 
the coefficients of Q(7) have (m — 3)/2 changes in sign. But this number equals 
(m — n — 2)/2 for n equal to 1. Furthermore, Q(O) is negative for a, > b)/b, . Thus 
the hypothesis of Theorem 4 is satisfied. 

lor the case where m is even, P(x) may be found to be 


2)/2 


P(z) = in| (a — bn-1)x”? + (bn-3 — Aobdn-2)a" ”” 


+- (dybm—4 _ ee ee ie + eee ao aah | 


P(x) have (m — 2)/2 changes in sign which is (m — n — 1)/2 in number for n equal to 1. 
Furthermore, since D(s) is a Hurwitz polynomial, b) is positive. Therefore, P(0) is 
positive for a, > 0 and the hypothesis of Theorem 4 is again satisfied. 


Again, if the conditions (13) are satisfied and if b,,_, > ao > 0 then the coefficients of 
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Descartes’ rule of signs only yields an upper bound on the number of positive roots 
of a polynomial. Sturm’s theorem [13] which determines exactly the number of positive 
roots of a polynomial (a multiple root being counted as a single root) is the basis of 
the following test. 

Theorem 6. Let fo(x) equal P(x) for m — n odd and equal Q(x) for m — n even. Let 
f,(x) equal df,/dx. Let f.(x) be the negative of the remainder obtained by dividing f(x) 
by f,(x). Let fs(x) be the negative of the remainder obtained by dividing f,(x) by f.(x). Let 
this process be continued as indicated below until the last remainder f,(x) is a constant or a 
polynomial which never changes sign. 


fo(x) = qilx)fi(x) — f.(x) 


— 
—~ 
SS 

| 


q(x) fo(x) — fs(x) 


fi AX) = (x) fe (x) ad f(a). 


Finally, let Vo be the number of sign changes in the sequence f,(O), f:(0), --- , f,(0) and 
let V.. be the number of sign changes in the sequence f)(@), f:i(~), --- , f,(~). Tf, for 
m — n odd, Vo — V. equals (m — n — 1)/2-and P(O) ts positive or, for m — n even, 
Vo — V. equals (m — n — 2)/2 and Q(0) ts negative, then Z(s) is a subclass m — n function. 

Proof. The hypothesis of this theorem is simply a statement of Sturm’s test. For 
instance, when m — n is odd, the excess of V, over V.. is exactly the number of real 
positive roots of P(x). When this equals (m — n — 1)/2, R(w) will have m — n — 1 
real zeros and the conditions for a subclass m — n function will be satisfied. A similar 
situation exists when m — n is even. 


The phase functions of subclass / functions. The phase angle of a subclass /: function 
for real frequencies has several properties which will be developed in this section. This 


phase function ¢g(w) is defined by 
Z(jw) = | Z(jw) | exp [je(o)]. (14) 


Of course, any multiple of 27 may be added to ¢g(w) without affecting the value of Z(jw). 
In order to deal with a unique phase function, the following convention will be adopted. 
Consider the factored form of the system function given by (15) where the poles are 
denoted by the symbols p; and the zeros by the symbols x; . 


7 (s 23 a M2) 


‘(o — \ 
Us) = K sf = Bel, (15) 





(s _ p;)(s ome P2) eoe (2 — Pm) 


It will be presumed that at any real frequency the phase angle of any factor in this 
expression, which is determined by the angle of the vector extending from the pole or 
zero to the particular frequency in question on the imaginary axis, remains within the 
bounds 37/2 and —7/2. Thus as w increases from — © to +, the phase angle for a 
factor of a pole or zero in the left half s plane will increase from —2/2 to r/2 whereas 
this angle for a pole or zero in the right half s plane will decrease from 32/2 to 2/2. 
When the pole or zero occurs on the imaginary axis, the phase angle for its factor will 
be —2/2 when w is smaller than the pole or zero and 7/2 when w is larger. Such angles 
are illustrated in Fig. 1 where the symbol ¥; denotes the phase angle for the factor of a 
zero and the symbol 6, denotes this angle for a pole. As was stated previously in Theorem 
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Fig. 1. Illustration of the Convention for Measuring Phase Angles. 


| the constant multiplier K must be a positive number if Z(s) is a class k function. 
Thus by this convention the phase angle for Z(jw) is unique and it is given by 


¢g(w) = > io > 6; . (16) 


t=—1 i=l 


Another test for the subclass k functions may be constructed using the phase function 
¢(w). It also determines the number of real zeros that R(w) and J(w) possess. 


Theorem 7. If the system function Z(s) is given by Eq. (1), tf its phase function at real 
frequencies g(w) is continuous, if dg/dw < 0, and if —(m — n)x/2 < ¢g(w) < (m — n)w/2 


for —2 <w< o, then Z(s) is a subclass m — n function. 


Proof. Whenever the phase function ¢g(w) equals an odd multiple of 7/2, the real 
part of the system function at real frequencies R(w) equals zero and, whenever ¢(w) 
equals zero or a multiple of x, the imaginary part /(w) equals zero. Thus, under the 
conditions of the hypothesis, as w increases continuously from — © to +, g(w) will 
decrease continuously from +(m — n)r/2 to —(m — n)w/2 and the number of times 
R(w) or J(w) changes sign may be determined by counting the number of times ¢(@) 
passes through an odd multiple of 7/2 or a multiple of 7. When m — n is odd, R(w) 
has m n — 1 changes of sign and when m — n is even, J(w) has m — n — 1 changes 
of sign. 

Furthermore, the denominator of Z(s) is a Hurwitz polynomial so that its magnitude 
at, finite real frequencies is always finite and positive. Moreover Z(s) cannot have any 
zeros on the imaginary axis for otherwise the phase function ¢g(w) would have a discon- 
tinuity at each such zero. Thus the magnitude of Z(jw) is always finite and non-zero 
for finite w. Since ¢(0) equals zero (g(w) being an odd function), R(O) is positive. The 


fact that dg/dw < 0 at w = 0 implies that d//dw is negative at w = 0. 
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Therefore, all the conditions for the definition of a subclass m — n function are 
satisfied. 

The hypothesis of this theorem is really much too strong. The condition that ¢(w) 
is strictly monotonic decreasing may be replaced by the following conditions which 
encompass a much larger set of functions. For m — n odd, ¢(w) equals an odd multiple 
of r/2 exactly m — n — 1 times and, for m — n even, g(w) equals a multiple of m or zero 
exactly m — n — | times; moreover dg/dw < 0 at w = 0. In this case the proof is practically 
the same. 


It can easily be seen that this theorem may be used in an alternate proof of Theorem 3. 
Since all the poles of the system function which is the reciprocal of a Hurwitz poly- 
nomial are in the left half plane and since there are no zeros, the phase function ¢(w) 
is strictly monotonic decreasing and satisfies the hypothesis of Theorem 7. 

Several properties held by the phase function at real frequencies ¢g(w) of a subclass 
k function are stated by the following theorem. 


Theorem 8. If the system function Z(s), given by Eq. (1), is a subclass m — n function, 
then it satisfies the following conditions. 
i. Z(s) has no zeros in the right half s plane; that is, Z(s) is a minimum phase function. 
il. Z(s) has no zeros on the imaginary axis; that is, g(w) is a continuous function. 


) a “ — 


lil. —(m — n)t/e < gle) < (mM — n)t 2 for —© ay. os &, 


Proof. i and vi. It is convenient to prove the first two statements of the conclusion 
together. Assume that there are p zeros in the right half s plane and q zeros on the imagi- 
nary axis so that the number of zeros in the left half s plane is (n — p — q). All the 
poles are in the left half s plane. As w increases from — © to +, the angle 6, corre- 
sponding to the pole p; will increase continuously from —72/2 to r/2. The angles y, 
for any zeros in the left half s plane will behave similarly. However these angles for 
zeros in the right half s plane will decrease from 37/2 to 7/2. Finally a zero on the imagi- 
nary axis will have an angle for its factor which is +7/2 at all w except at the zero 
where there will be a discontinuity of magnitude z. 

Now consider the phase functions ¢’(w) determined by all the poles and only those 


zeros which are off the imaginary axis. This function is continuous. 


ew) = DVvi- De 

i l i 1 
The number of times R(w) or J(w) changes sign must be at least as great as the number 
of times the function g’(w) varies continuously and monotonically through odd multiples 
of /2 or multiples of z, respectively. For, the contribution to the phase function g(w) 
due to the zeros on the imaginary axis is a step function each of whose discontinuities 
is a multiple of z and such a contribution will only yield additional zeros to R(w) and 


I(w). The function ¢g’(w) varies continuously over a range at least as large as 


[m — (n — p — gq) + plz. The only way for R(w) or /(w) to have only m n— 1 
changes of sign is for this range to be no greater than (m — n)x. Thus both p and q 


must be zero. 

tit. By the above argument, all the poles and zeros of Z(s) are in the left half s plane 
so that ¢(0) equals zero. Moreover it has been shown that the range of variation for 
¢y(w) can be no greater than (m — n)x. Thus ¢(w) is an odd function and it is bounded 


by +(m — n)r/2 for —~ <w < +o-., 
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Part II. Bounds on the impulse and step responses. As is well known, the unit 
impulse response W(¢) is related to the system function by the Fourier transform 


W(t) = = | Z(jw) exp (jot) dw. (17) 
i a 


It will be presumed that all input functions are applied at ¢ = 0 so that the response 
for any physically realizable system must be zero for negative values of time. This 
result may also be derived from the condition that Z(s) has no poles in the right half 
s plane. Because of this, W(t) may be represented either by Eqs. (18) or (19) for positive 


values of time. 


9 a= 
W(t) = - | R(w) cos wt dw, t>0 (18) 
T Jo 
' arf P 
Wi) =-= |] Mw)sinetdo, t>0 (19) 


70 


It has been shown previously that when the frequency responses of networks are 
restricted in various ways the transient responses are bounded [1-3]. Specifically, when 
the real part of a system function is of constant sign and the system function has one 
more pole than the number of its zeros, the magnitude of the corresponding unit impulse 
response is bounded by the constant multiplier AK of the system function. Similarly 
when the imaginary part of a system function with two more poles than zeros is of 
constant sign for positive frequencies, then the magnitude of the unit impulse response 
is bounded by At. The restriction on the frequency characteristic of a class k system 
function leads to a generalization of these two conclusions. This generalization, which is 
given by the following theorem, states that the magnitude of the unit impulse response 
is bounded by Kt"”""'/(m — n — 1)!, the symbols being defined in Eq. (1). (By the 
initial value theorem, the upper bound is seen to be the initial value of the unit impulse 
response.) It follows that the magnitude of the unit step response will be bounded by 


Kt" "/(m — n)! so that the rise time from zero to one will always be greater than 
[(m — n)!r/K] " where r equals Kao/b) (presuming that a) > 0). 
Theorem 9. If the system function Z(s), given by Eq. (1), is a class m — n function, 


then the corresponding unit impulse response W(t) is bounded by the following expression 
jor lt m8 


; ai 
Wil < ° (20) 
(m —n — 1)! 

Proof. For m n = 2v+1(v = 0,1, 2, ---), upon integrating by parts expression 
(18) 2v times and using the fact that R,(~) = 0 for q < m — n, the following may be 
obtained 

W(t) = (-))’t | R.,(w) cos wt dw. (21) 
qT “0 
But by definition of a class m — n funetion (—1)" R.,(w) > 0 for all w and so 


a x 


! wylv 2 
W(t)| <(-N’e | R(w) de. 
us “0 
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By integrating Z,,(s) around the o > 0 half plane, the integral, {> R.,(w) dw, is found to 
equal (—1)” K2/2(2v)!. Thus the conclusion of this theorem is obtained for the case 


where m — n is odd. 
For m — n = 2r(v = 1, 2, 3, ++), a similar argument may be applied to (19). Inte- 
gration by parts 2v — 1 times yields 
W(t) = (—1)"**?" "= | T.,-:(w) cos wt dw. (22) 
T ~¥W 


Again by a definition of a class m — n function, (—1)’*' J,,_,;(w) > 0 for all w so that 


2 ff 


Wid) l< (—1)"""""" | T,,-,(w) dw. 


T . 
Integrating Z,,_,(s) around the o > 0 half plane, the integral on the right hand side 
of the last expression is found to be equal to (—1)’** Kx/2(2v — 1)!. This yields the 
conclusion once more. 
The next two theorems depend upon two inequalities that the sine function satisfies 
and which were proved in a previous paper [3]. 


Lemma 4. For 0 < y <1,x > Oand Na positive ante ger, 


N 
; oo . x 
sinz < Qor + -. Yer” sin = (23) 
p=1 Pp y 
and 
: a! ee 
sn z > —Q or + 2. (—1) sin : (24) 
p=1 P Y 
where 
‘ Te 
(—s)(1* — 9?) --- [& — 2)? — 
Q 14 x (—y) y) ed l y | . (95) 
k=1 v! 
awe (—y)(? —7)--- (k-1% -—y 
Q., = (—1)"2 >> — ys . p01, %+,¥. 


(k — p)\(k + p)! 


Any system function which satisfies the inequality (9) or the inequality (10) (as 
has been noted before, all subclass k functions are of this type) will have the property 
that, when its corresponding impulse response is bounded beyond a certain time, then 
other bounds on this response will be determined before this time. The physical signifi- 
cance of this is that the more rapidly an impulse response “‘settles down”’ the less violent 
must this response be. This result is given by the next theorem which is a generalization 


of Theorems 1 and 2 in [15]. More precisely, the conclusions of [15] are special cases of 
the following obtained by setting m — n equal to one or two and making some trivial 


changes in the notation and normalization. 


Theorem 10. Let the system function Z(s), given by Eq. (1), satisfy the inequality (9) 
if m — nts odd or the inequality (10) if m — nis even. If the magnitude of the corresponding 
unit impulse response W (t) ts less than or equal to M for t > r where M is a positive number, 
then forO < y <1, 


—— K sin ry 2My”" "sin ry < | 
Wyr) | <7 , (yr)""" + : “ro ; “te (27) 
y —~ (m — n — 1)!ry yt) t T 2 . ’ ” 
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Proof. First consider the case where m — n = 2v + 1 (vy = O, 1, 2, ---). Integrating 
(18) by parts 2v — 1 times, (28) is obtained. 


Wit) = (-yre f Rod ain wt des. (28) 


By hypothesis, (—1)’"* R,,_,(w) is non-negative. Therefore, setting x equal to wt, the 
sine function may be replaced by the right hand side of (23) and the result integrated 


term by term to yield 
; timirT (fy) wpe 
WO) < (-1)"8'Qo= | wR, slo) do + DY (4) @.,0("). 
a 0 p=1 Pp y 
Moreover, 


[ oR.) do = -—[ Re(e) de. 
70 0 
The value for this last integral may be found by integrating Z,,(s) around the ¢ > 0 
half plane 





: _ (=1)""'*K 
-| R.,(w) dw = 2(2»)! , 
Therefore, 
ae a (¥)” (2) 
Wi) < (2)! + > Q.,Vb P (29) 


Furthermore, the W(t) are bounded for all ¢ since 
ee a a oe ;, 
|W | <= [ Bed lide < @. 


Thus it may be found that the double series obtained by letting N go to infinity in the 
right hand side of (29) converges absolutely for 0 < y < 1. So summing over the k in 
Eqs. (25) and (26), the following may be obtained where the Q., converge to the qo, 


as shown in [16]. 


; K qot”’ . (¥)” (2) 
VW (t) < et sal 2 ~ a? t 
< “or + > ; go,Wi : (30) 
where 
q, — ary 
~~ = ry 


en = (1 See. 
mp — y) 
Now if t/y is set equal to r and W(pt/y) is replaced by M for p odd and by —M for p 
even, the upper bound of (27) is achieved. The lower bound is similarly obtained by 
use of (24) rather than (23). 
The conclusion of this theorem may also be achieved in the case where m — n = 2p 
(v = 1, 2, 3, ---) by integrating (19) by parts 2» — 2 times and then proceeding in 


the same way. 
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Finally bounds which are similar to those of Theorem 10 exist on the step responses 
of those system functions which satisfy either inequality (9) or (10) and whose constant 
a, is greater than zero. Once more, it may be noted that all subclass k functions are of 
this type. The unit step response A(d) is related to the unit impulse response W(t) by 
the expression, 


asl 


A(t) = | WE) dé. (31) 


The following theorem is once again a generalization of Theorem 3 of [15]. 


Theorem 11. Let the system function Z(s), given by Eq. (1), satisfy the inequality (9) 
af m — n is odd or the inequality (10) if m — n is even and let ay be positive. If the cor- 
responding unit step response A(t) is bounded by (1 + y)r fort > r where rr = Ka,/b, > 0 
and y is a positive number, then, forO < y < 1, 


get aa! , ( > m—n : x aiy* x 1 ) 
sind & y inwy) Kr" 4 Dy? > : ( )* a cae . __] ; (32) 


vs \((m — n)! iv y—y) =r (— vf) 
and 
m-n-1 _* - m—n x ) 
y sin ry J Kr a l = 
A(yr) > - ~ .- 2y7r(1 — vy) > iimcrocea a ( (33) 
T \(m — n)! iv UCU YY) 

Proof. To obtain the upper bound (32) for both cases where m — n is odd and m — n 
is even, the proof proceeds in the same way as in the preceding theorem until the in- 
equality given by (29) is obtained where 2y is replaced by m — n — 1. Integrating 
according to (31), 

KQ,t"" —~ fy\""" at 
A(t) < -+ > (2) Q,,A(©). (34) 
(m— n)! p=1 \P 4 


However, 
2t f 
A(t) | < | Rw) | dw for t> 0. 
W . 


Thus A(t) is bounded for 0 < t < @ and so it may be seen that the double series ob- 
tained by letting N go to infinity in the right hand side of (34) converges absolutely. 
Upon letting NV go to infinity and summing over the k in Eqs. (25) and (26), the following 
may be obtained where the q,, are given below expression (30) 


Kqot” =~ /y\"~ pl 
A(t) < + (4) g,A(™). (35) 
— (m—n)! 2d p/ f ‘vig 
Now setting ¢/ equal to r and replacing A(pt/y) by (1 + y)r for p odd and by (1 — y)r 
for p even, expression (32) is achieved. 


I’xpression (33 may be derived in a similar fashion using (24). In this case, the 
expression corresponding to (34) is the following 


At) > = KQot” 4. > (-1) ( 2)" QAP), 


(m — n)! = Dp y 


As N goes to infinity the double series converges absolutely again. Also the A (pt/y) 
must be replaced by (1 — y)r for all p to maintain the inequality. This completes the 


proof. 
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For m — n equal to one or two, some of the infinite series given in (27), (32) and 


(33) may be expressed in closed form as follows. 


1 1 T 
———. = —> aim < 
» y Oy? 2y cot ry, 0 y < i 


y=1 
] 


me” a —[¥y) + ¥(-y) + 2g], O<y<l. 


ay" > vy 


Here V(y) is the digamma function [17] defined by 
Vy) = J log Ta 
oe dy og T(y) 


and g is Euler’s constant defined by 


: ~~ | EAST 
g= lim ( ta log n) = O72 -°°. 


n—-0 ve=l 


The results of Theorems 10 and 11 and the following corollaries lla and 11b are 
not the best possible and can be improved. For it was presumed in the proofs that 
W (pr) and A(pr) equal their bounds +M and (1 + y)r for all positive integers p. This 
behavior is impossible since W() equals zero and A() equals r. 

Restrictions on the rise time and settling time. Theorem 11 yields restrictions on 
the shape factors of the unit step response which are again generalizations of results 
that have appeared previously [3]. Defining the rise time 7’, as the time it takes for the 
unit step response to first cross the final value line after the input has been impressed 


(see Fig. 2), it is found that a lower bound exists on this rise time which becomes larger 


A(t) | 








Cl ce ee ces es see tn eater ce 





4 

* 
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o 


Fig. 2. Illustration of the Shape Factors for the Unit Step Response. 


as the maximum overshoot or undershoot y becomes smaller. That is, the rise time and 
maximum overshoot or undershoot of the unit step response define a point on the plane 
of Fig. 3 and this point must lie above the curve for the appropriate m — n. Further- 
more, let 7’; be the least time at which the unit step response crosses the final value 
line and beyond which the overshoots and undershoots are less than or equal to 6. (Some 
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of the overshoots and undershoots may be greater than 6 before 7’;). In this case the 
curves of Fig. 3 still apply giving lower bounds on this shape factor 7; . These results 
may be stated as follows. 


3.8 





= 
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Fic. 3. The Envelopes of the Lower Bounds on the Rise Time of the Unit Step Response Given by 
( orollary lla. 


Corollary 1la. Let the system function Z(s), given by Eq. (1), satisfy the inequality (9) 


if m — n is odd or the inequality (10) if m — n is even and let ay be positive. If the cor- 
responding unit step response A(t) is bounded by (1 + y)r fort > T, where r = Kay/by , 
y ts a positive number, and A(T',) = r, then 
/ - , va 
p, > {imam | me _ gyernes $e ED 
' K sin ry ei = yy" "v — y) aa 
(36) 


x 1 \V m—n) 
— nym" t2 . 
ties 2X , wo val 
whereO < y <1. 


This corollary follows immediately from expression (32) if yr is set equal to 7’, so that 


A(yr) equals r. Since A(~) = r, equality between the two sides of (36) is impossible. 
For a given y and m — n, the right hand side of expression (36) is a function of y. 


As the parameter y is varied between zero and one, a family of curves is generated on 
the plane of Fig. 3 and 7, must be greater than the envelope of this family of curves. 
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These envelopes for several values of m — n are shown in Fig. 3, where 7'y is a normaliza- 
tion factor given by 








r 1/(m—n) 
T, = ( 5) | (37) 
As y approaches one, the envelopes approach a point on the ordinate axis given by 
2(m — n) + 1 . (—1)’ | 
a fe oot” ; 
( m n) | 2 + 2 dX yy" ~~ 1) (38) 


As y approaches zero, the envelopes progress infinitely to the right approaching the 
horizontal line whose ordinate is (m — n)!. 

A settling time r; for the unit step response may be defined as the least time beyond 
which the unit step response remains within the bounds (1 + 4)r. This is illustrated in 
Fig. 2. Again Theorem 11 implies a lower bound on 7; as shown in Fig. 4. For a given 6, 
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Fia. 4. The Envelopes of the Lower Bounds on the Settling Time of the Unit Step Response Given by 
Corollary 11b. 
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the settling time 7; must be greater than the curve corresponding to the appropriate 


m — n. These curves are an immediate consequence of corollary 11b. 

Corollary 11b. Let the system function Z(s), given by Eq. (1), satisfy the inequality (9) 
if m — n is odd or the inequality (10) if m — n is even and let ay be positive. If the corre- 
sponding unit step response A(t) is bounded by (1 + 4)r fort > +; where r = Ka,/b) and 
5 is a positive number between zero and one, then 


( ' x v+l 
m— n)!r TY “ ae (—1]) 
™ > { K | = (l— 3) — & : z mance WP) 
K sin ry re y= , 
\ ( 1 Y | (39) 


icine x 1 1/(m-—n 
— 25y" """ pa cs ee , 
an r= 2 ) 


whereO < y <1. 


Setting yz equal to 7; , A(yr) may be replaced by (1 — 6)r, r by 7;/y and y by 4 in 
expression (32). This will yield expression (39) upon rearrangement. Again A(~) = r 


so that equality between the two sides of expression (39) cannot be achieved. 

Once again, a family of curves is generated on the plane of Fig. 4 by the right hand 
side of expression (39) when the parameter y is varied between zero and one. The settling 
time 7; must be greater than the envelopes of these families of curves for various values 
of m — n. These envelopes are shown in Fig. 4. As y approaches one, the envelopes 
approach a point on the ordinate axis which is given by expression (38) and, as y ap- 


proaches zero, they approach the value, one, on the abscissa axis. 


APPENDIX I 


Examples and applications. 1. As an illustration of the bounds holding on the impulse 
response of a system function which is the reciprocal of a Hurwitz polynomial, the follow- 


ing system function is considered. 


1.241 
Zs) = — anaes 
(s + .591)° + (.806) 


The corresponding unit impulse response is 
W(t) = 1.537e"°°*"* sin .806¢ 


and this is plotted in Fig. 5. Since Z(s) is the reciprocal of a Hurwitz polynomial of 
second degree, it is automatically a Subclass 2 function by Theorem 3. Thus all the 
bounds developed in Part II apply to its corresponding unit impulse and unit step 
responses. The bound, 1.241/, on the magnitude of the unit impulse response given by 
Theorem 9 is readily seen to hold. Furthermore, as seen in Fig. 5, | W(t) | is less than 
.063 for all ¢ greater than 3.486. Thus by Theorem 10, | W(yr) | is bounded in the interval 
0 < t < 3.486 and these bounds are also shown in Fig. 5. Of course other bounds on 
| W(t) | could be obtained by choosing other possible values of M and r. 
2. An example of a Subclass 4 function is the following system function 
(s + 1.25)* + .5625 


“3) = GF Det Die+ I+ lle +3 +9) 


To show that this is indeed a Subclass 4 function, the function Q(x), which is defined 
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Fia. 5. Illustration of the Bounds of Theorem 10 in the Case Where K = 1.241, 1 = 0.063, m — n = 2, 
and rt = 3.486. 
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Fic. 6. Bound on the Unit Step Response Discussed in the Second Example. 
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immediately before the statement of Theorem 4, may be found to be 
Q(x) = 8.500x* — 38.382° — 66.00x — 253.5. 


Since the coefficients of Q(x) have only one change in sign and since Q(0) is negative, 
Z(s) is truly a Subclass 4 function by Theorem 4. Therefore all the bounds given in 
Part II hold on the transient responses corresponding to this Z(s). 

In particular, the unit step response is 


A(t) = .0295 — .048le~* + .0281e~** + .0247e~‘ cos (t + 1.89) 
+ .0064e~*' cos (3¢ + 4.48) 


and this is plotted in Fig. 6. The bound, ¢*/1.413, on the magnitude of A(t)/r, which is 
a consequence of Theorem 9, is shown by the dotted curves. Moreover pairs of values 
for the variables 7; and 6 are seen to be well above the curve for m — n = 4 in Fig. 4. 
For instance, for ¢ greater than 2.60, A(t)/r is bounded by (1 + .129). Moreover the 
normalization factor 7’'y has a value of .429 for this example. The point determined by 
7;/Ty = 6.06 and 6 = .129 is above the appropriate curve in Fig. 4. 

3. A network [18], which has satisfactory responses for appropriate choices of the 
network parameters is shown in Fig. 7. Both the series-shunt peaking network and the 


i (t) 


ae 





. + 
LU00™ —o 
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Fig. 7. A network whose transfer function of output voltage E.(s) divided by input current J(s) is 
always a Subclass 3 function for all positive values of C, , C: , and L and all driving point impedances Z 


having a finite, non-zero value at DC. 


Dietzold network have this form and their transient responses of the output voltage 
e,(t) as a function of the input current 7(¢) are plotted in [18]. 

It is easily seen that if the driving point impedance Z(s) has a finite non-zero value 
at DC, then the transfer function Z,(s) of output voltage Z,(s) divided by input current 
I(s) will always be a Subclass 3 function for all positive values of C, , C, , and L and 
for all permissible driving point impedances Z(s). For, the ratio of the output voltage 


to input voltage F,(s) is 





E,_ 
E, °F Lt oa > l 
But since E,(s) = I(s)Zp(s) where Z,(s) is the input impedance of this network, 
’ E,E 1 = 
Z (s = —2 = — ,——— 7 (8 . 
Zr 8) E, I LCs’ re 1 Zi S) 
Thus, 
Rp( 
Rid = Re 8d) < 2, 
1 — LCw 


Since Rp(w) is the real part of the passive driving point impedance Zp(jw), it is non- 
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negative for all w. And so, Ry(w) will have exactly two changes of sign. This means that 
Z,(s) is a Subclass 3 function provided that Rp(O) is non-zero and finite. 

An inspection of the unit step responses corresponding to Z,7(s) for the special cases 
of the series-shunt peaking network and the Dietzold network [18] shows that these 
transient responses satisfy all the restrictions developed in Part II of this paper. It 
should be noted that these networks are suitable for video amplifiers where a small rise 
time from 10 to 90 per cent coupled with small overshoot is of importance. If a small 
initial time delay is also desired, then these networks would not be particularly useful 
since they have Subclass 3 transfer impedances and their step responses must have 
appreciable initial time delays. 

APPENDIX II 

List of symbols. Those symbols which have physical significance are defined as 

follows: 


A(t), the response to a unit step function applied at time, t = 0; 


a ; the coefficients in the numerator of a system function; 
b, , the coefficients in the denominator of a system function; 
D(s), the denominator of a system function; 


f,(x), the Sturm functions defined in Theorem 6; 
I(w), the imaginary part of a system function for real frequencies; 
I,(w), a function defined by Eq. (4); 


K, the constant multiplier of a system function; 

m, the degree of the denominator of a system function; 
N(s), the numerator of a system function; 

n, the degree of the numerator of a system function; 


P(w°), the even part of N(jw) D(—jw); 
Q(w), the even part of N(jw) D(—jw)/w; 


R(w), the real part of a system function for real frequencies; 

R,(w), a function defined by Eq. (3); 

r, the resistance of a system function under DC conditions; 

s, the complex frequency variable; 

Ty, atime normalization factor defined by expression (37); 

Tae the rise time from zero to one of the step response; 

t, the time variable; 

W(t), the response to a unit impulse function applied at time, ¢ = 0; 

x, the square of angular velocity = w’; 

Z(s), asystem function; 

Z,(s), a funetion defined by Eq. (2); 

Y, the least upper bound on all the fractional overshoots and undershoots of the 
unit step response; 

5, the least upper bound on | A(t) — r|/r for t > 7; ; 

6; , the phase angle for a pole factor defined in Fig. 1; 

ig a zero of a system function; 

Pi; a pole of a system function; 

o, the real part of the complex frequency variable; 

ty any time beyond which the unit step response remains within the bounds 


(1 + y)r; 
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v3; the least time beyond which the unit step response remains within the bounds 
(1 + 6)r where0 < 6 <1; 
¢(w), the phase angle of a system function defined by Eq. (14); 


v;., the phase angle for a zero factor defined in Fig. 1; 
w, the imaginary part of the complex frequency variable, s. 
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SOME NEW TECHNIQUES IN THE 
DYNAMIC PROGRAMMING SOLUTION OF VARIATIONAL PROBLEMS* 


BY 
RICHARD BELLMAN 
The Rand Corporation, Santa Monica, Cal. 


Summary. In previous papers, it has been shown that the functional equation 
technique of dynamic programming may be applied to yield the numerical solution of 
a wide class of variational problems of the type occurring in mathematical physics, 
engineering, and economics. 

It was seen that the numerical solution of a problem involving N state variables 
depended upon the computation of sequences of functions of N variables. This fact 
made the method routine only for the case where N = 1 or 2, with grave difficulties 
arising in the general case. 

In this paper, it is indicated how this difficulty can be overcome for a large class of 
problems in which the underlying equations and criterion function are linear, although 
the restraints on the forcing functions may be non-linear, corresponding say to energy 
considerations. 

The same methods are applicable to other classes of linear equations, and, in par- 
ticular, to differential-difference equations, arising from time-lag problems and to 
various classes of partial differential equations. These problems could not previously 
be treated by dynamic programming techniques in any usable fashion. 

Finally, it is briefly indicated how the method of successive approximations may be 
combined with the foregoing techniques to reduce general variational problems with 
non-linear equations and criterion function to sequences of problems that can be solved 
numerically by means of sequences of functions of one variable. There are a number of 
interesting and difficult convergence questions associated with this program; these, 
however, are not discussed here. 

1. Introduction. A variational problem that is encountered in many areas of pure 
and applied mathematics is that of determining the minimum or maximum of a func- 
tional of the form 


»T 


J(y) = | F(z, , 225 °** »Ee it» Wes°** 9a a (1.1) 
over all functions y; , Yo, -** » ¥w , connected to the x; by means of the relations 
dx; , , P : 
it = GAG . Ze. *** » Ex oe ail > Um); x,(0) =C, t= oe 2 rs N, (1.2) 
rf 


and satisfying constraints of the form 
(a) 2(T)=b;, t=1,2,---,k, (1.3) 
(b) R(x; y) < O, j=1,2,°-+ ,m. 


For a variety of reasons, which preliminary mathematical study discloses quite readily, 
this problem, to the degree of generality stated above, presents formidable analytic 


*Received Sept. 23, 1957. 
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difficulties. Not only do these difficulties arise in connection with the explicit solution 
of the problem, see [9] and [10], but also in connection with the apparently more modest 
demand for a numerical solution. 

We have discussed elsewhere, [1, 2, 3, 4], various applications of the functional 
equation technique of dynamic programming to the numerical solution of classes of 
problems of the foregoing type. In this paper, we wish to indicate some recent develop- 
ments which greatly enlarge the scope of the methods utilized in the references cited 
above. These new developments, combined with the classical tool of successive approxi- 
mations, enable us to attack systematically large classes of problems formerly far beyond 
our powers. 

We shall begin our discussion with a terminal control problem for a linear system 
with constant coefficients. The problem we consider is that of maximizing a functional 
of the form 


J(y) = H[x(T), x(T), --- , x,(T)] (1.4) 


over all functions y; related to the x; by means of the linear differential equations 


d: ad , 
= = 2 aise; oo Fig. z(0) =¢;, b= 3.2. «+ (1.5) 
=1 


dt fal 
and satisfying constraints of the following type: 


(a) m; < y(t) < mi, = ¢<. 7 t+=1,2,--- ,N, 


gl . 
[ Ki(y: , Yo ,°°* > yn) dt S¢;, 9=1,2,---,0. (1.6) 
Jo 

Whereas the techniques previously given enable us to convert this problem into one 
involving the sequential computation of functions of N variables, the linearity of the 
defining equation in (1.5) permits, as we shall see, a transformation of the problem 
into one involving sequences of functions of k variables, where k is as in (1.4). 

Since current digital computers do not allow the storage of functions of more than 
two variables in any feasible way, this is a very important reduction if k is equal to one 
or two. As it turns out, a number of significant problems arising in the fields of economic 
and engineering control may be formulated in these terms. 

If the function H(a, , 22, *** ; 2) is linear, 


k 
H(a, ,%3,°** ;%) = d a2; , (1.7) 
i=l 
the problem may be still further reduced to a computation involving sequences of func- 
tions of one variable, regardless of the values of k and N. The same result holds for the 
case where it is desired to maximize a linear functional of the form 


Se [[ S vee | at. (1.8) 


In exactly the same way, we can treat discrete variational problems where difference 
equations replace differential equations. As a matter of fact, these techniques were 
developed in connection with a discrete problem, the “caterer” problem, [5]. 

As we shall see, the same methods are applicable to the treatment of variational 
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questions involving time lags and retarded control, see [7]. In simple cases involving 
only one time delay, the equation corresponding to (1.5) is 


dx; x = x ~ 
— p is a;;x;(0 + 2, aii(t - + 2 bid, (1.9) 


a(t)=c(t), O<St<6, t=1,2,---,N. 


More complicated types of hereditary processes and various classes of partial differential 
equations may also be treated by means of the techniques we present below. Formerly, 
problems of this nature could not be treated computationally by dynamic programming 
techniques because of their dependence upon functionals rather than functions. 

As noted above, the results discussed in the foregoing paragraphs depend in an 
essential manner upon the linearity of the equations defining the process. To extend 
these techniques to cover more general processes, we turn to that general factotum of 
analysis, the method of successive approximations. Without entering into any of the 
questions of rigor, we indicate briefly how a variety of apparently multi-dimensional 
problems can be reduced to sequences of problems involving functions of one variable. 

Although a large number of interesting and important questions concerning con- 
vergence, rapidity of convergence, stability, and so on, arise from these investigations, 
we shall postpone any investigation of these matters until a later date. 

Similar techniques can be used to treat processes involving random effects. These 
will also be discussed subsequently. 

2. Dynamic programming. In order to appreciate the improvement in technique 
afforded by the methods we present here, let us sketch briefly the direct approach of 
dynamic programming to variational problems of the aforementioned variety. 

To treat the general question posed in (1.1)-(1.3), we define a function of N variables, 
fle, , @ , «++ , ey ; T), by means of the relation 


fle, ,¢2 , +++ ,¢y ; T) = max J(y), (2.1) 


where the maximum is taken over all functions y;(t) satisfying (1.3). Under appropriate 
assumptions concerning the continuous dependence of the maximizing functions upon 
the initial values c; and the upper limit 7’, we obtain for the function f a non-linear 
partial differential equation 
‘ 
of = , Of 
—. = max | Fc, v) + G,(c, v) =— 
1 os | 2, dc, (2.2) 
fC; »€2 5 °** yey 30) =O. 
The maximization in the above expression is over quantities v; satisfying the constraints 
(a) m <0,< m, i=m1,3.--- NW 
—_— —_— ? ’ ? (2.3) 


(b) R;(c,v) < 0. 

If F and G satisfy suitable differentiability conditions, this non-linear partial differ- 
ential equation leads, in the absence of constraints, to the usual Euler equations, via 
the method of characteristics, see [1, 2]. 

In general, in questions of greatest interest in applications, this reduction from partial 
differential equations to ordinary differential equations does not occur due to the presence 
of various constraints of a physical nature. 
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3. Computational aspects. Since the analytic solution of problems of the type 
discussed in the foregoing sections is only rarely achieved, we turn to the study of com- 
putational techniques. Our aim initially is to provide a numerical algorithm, based upon 
the use of a digital computer, which resolves questions of this nature in a routine fashion 
without the use of an extensive analysis strongly dependent upon the particular analytic 
structure of the functions occurring. In practice, simplifications can always be made, 
taking advantage of the structure of the particular problem under consideration. As 
we shall see below, it is only by proceeding in this manner that we can make significant 
advances. General methods, however, are always useful, if only as a court of last resort. 
To determine f(e, , ¢: -- ,¢y, 7), we can either use (2.2) and any of a number of 
standard techniques for the numerical solution of partial differential equations of this 
type, or, as has turned out to be preferable, we can go over to a discrete version of the 
original continuous process. Analytically, this means that the original differential equa- 


tions are replaced by difference equations. Thus, (1.2) becomes 
a(i+ 6) = 2;(t) + 6G, [x,(), ZAt), °° , tvlH3 yO, ylD, °°: ,; yu(t)], (3.1) 
\o. 


r(0) =c¢; , pom §, 3, ++* NK; 


i 


with ¢ assuming only the values 0, 6, 26, 
The non-linear partial differential equation is then replaced by the non-linear re- 
currence relation 
f(c, T) = max [6F(c, v) + fle, + 6Gi(c, v), «++ , ey + bGy(c, v); T — 4]], (3.2) 
» 0.4 
f(c, O) 0. 


In order to carry out the indicated process, we must be able to tabulate functions 
of N variables. This means that at the present time the approach outlined above is 
only feasible if N = 1 or 2. For N > 2, the memory requirements become prohibitive. 

It is necessary then to develop some new techniques if we wish to utilize dynamic 
programming to solve large scale problems of the type arising in the engineering and 
economic spheres. 

4. Preliminaries on linear systems. 
known results concerning the solution of 
equations. These will be utilized in what follows. Proofs of the results cited here may 


In this section we shall mention some well- 


vector-matrix systems of linear differential 


be found in [6]. 
The linear system of (1.5) may be written, using an obvious vector-matrix notation, 
in the form 


lr 
7 = Ar + By, 2(0) = c. (4.1) 


Consider first the case in which A is constant. The solution of (1) may then be written 


in the form 
x = exp (Aljc + | exp [A(t — s)]By(s) ds. (4.2) 
If A = A(t), a matrix dependent upon t, then x may be written in the form 


ae 


z= X()e+ | X(t) X '(s)By(s) ds, (4.3) 
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where X(t) is the matrix solution of 


dX 


= A(x, X(0) = I. (4.4) 
dt 


We shall utilize these representations in a crucial manner below. 

5. Terminal control—General non-linear criterion. Let us now turn to the problem 
of maximizing a given function H[x,(T), x,(T), --- , 2,(T7)] of the terminal state of 
the system over all control functions y;(t) which are related to the x; by means of the 
linear equation in (4.1), and which are subject to the constraints 


(a) m,; < y(t) < mi, <= 35:< Zz. $=21,2,--- ,N, 
.7 (5.1) 


(b) | Gy: , Yo, °** » yn) at < k. 


We wish to show that the numerical solution of a problem of this type can be made 
to depend upon a sequence of functions of k variables, rather than upon sequences of 
functions of N variables. We shall consider first the case where A = (a;;) is a constant 


matrix. 
We begin with the linear representation of (4.2), which yields a set of equations 


af N 
a(t) = 2(t) + | | a(t — one | ds, += 1,2,--- ,N, (5.2) 
( }=1 


/0 


where z,(¢) is the 7th component of exp (Ad)c, and X(t) = [zx,;(0]. 
The problem we wish to consider may then be cast in the form of maximizing a 


functional of the type 


7 \ 7 Nv 
Hw t | | > x,(T - on | ds, +++ ,u,+ / b ui (T — ond | as), (5.3) 


where wu, , u - , u, are given quantities, over all functions y, , ¥2 , +++ , yy Satisfying 
the constraints (5.la) and (5.1b). 
Let us then consider the sequence of functions f(w, , uw , +--+ , u ; 7), implicitly 


dependent upon \, defined as follows. 


f »7 oT 
ea ee Ee re | Adin rit oie | [-+-] ds] 
a7 
—X | GY. ,Yo,°°° ss) as | 


where the functions y,(¢) are now constrained only by (5.1a). 

\ motivation and discussion of this use of the Lagrange multiplier may be found in 
[8], and a numerical example in [11]. The value of the method lies in the fact that it 
enables us to reduce multi-dimensional problems, where the dimensionality is reckoned 
in the dynamic programming sense, to sequences of lower dimensional problems. 

To obtain a functional equation for f(u, , u, +++ , u& 3; 7), we proceed as follows. 
Suppose that the values of y,(¢), y.(t), --- , yy(t) have been determined over [0, 6]. 
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Then we may write 


H( u, + [bles ee uy + [ t--1as) — fr Gy. , Y2,°** » Yn) ds 
= Hu +f t- last ft Bile os ut ft adie di sas) 


nb T ; 
-~ >} | Gyr »Y25°** » Yn) ds — rf Gy, Y2,°** » yw) ds (5.5) 
/J0 é 


y 


T-6 ] 
= H(w, + [t- ]ds + [ | > > 2(T — 6 — s)y,(s + | ds, x2] 
j=1 
8 T-é 
— [ Gy: , Yo, *** » yy) ds — X [ Glyi(s + 4), --- , yv(s + 8)] ds. 


The principle of optimality, see [1], then yields the functional equation 


ab 
flu; , U2, °*+ » ue 3 7T) = max Ee | Gy: » Yo, °*** , Yn) ds 
y[{0,8) 70 (5 65) 


+ us + + | | Sauce x,(T — — 9s | ds, +) I, 


7=1 


The maximum is now taken over all functions y;(s) defined over 0 < s < 6, and satisfying 
the constraints mi < y,(s) < m; in| 0, 6 
For computational purposes, we may use the approximate relation 


flu, , Ue, *** » uy 3 T) = max | -asce. » U2 5 °** 5 Uy) 


+ ilu +6 >> 2,,(T); , |] 


or we may start with a discrete version of the original process. 
We have thus reduced the numerical solution of the variational problem to the 
determination of a sequence of functions of k variables. If k = 1 or 2, we have a feasible 


method of solution. 
6. Terminal control—Quadratic criterion. Let us note that the problem of mini- 


mizing the functional 
Q[xi(T), 2.(T), --- , ty(T)] (6.1) 
over all functions y satisfying the relations 
dx 
dt 
can be reduced to the solution of systems of linear equations, if Q is a quadratic form in 


the z,(7); see [9]. 
The same holds if we add to Q a quadratic function of the forin 


= A(i)x + Biby, z(0) =, (6.2) 


aT 
| P(x, 9% 5 °°* »Xn Yr > Y25°°* » Yn) dt, (6.3) 


where P is quadratic in its arguments. 
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Problems of this type may also be very simply treated by means of the formalism 
of dynamic programming, a matter we will discuss elsewhere. 

7. Terminal control—Variable coefficients. It is important, in connection with our 
subsequent discussion of the use of successive approximations, to consider the same 
problem for the case where A is a variable matrix. Let us consider then the situation 
where the equation governing the process has the form 


& = Ade + Boy +4, 20) =e. (7.1) 


As we know, the solution of this equation is given by the expression 


z= a)e+ [ 22 "@BOy) ds + [ x(t)x7*(s)o(8) ds. (7.2) 
0 0 
Hence the components of x(7’) have the form 
T N 
2(T) =u + f b rlD Dud | ds, (7.3) 
0 i=1 


where the u; are independent of y. 

In order to take account of the non-stationarity of the process, we count time back- , 
wards. In place of noting the time at which the process ends, we single out the time at 
which it begins. Fixing 7, we consider the function f(u,; , u2, --- , u ; 7) defined by the 
relation 


T N 
flu, , Ue, °** » Ue 37) = max | aw + | Duran ds, | 
r 7=1 


vir. 7) (7.4) 
P 
-»/ Gy , Yo, *** 5 Yn) ds}. 
Arguing as in the preceding section, we see that f satisfies the relation 
r+é 
flu, , U2 ,°** »Ue5rT) = max | -. [ Gi. ,Y2,°** » yw) ds 
vir.r+8] dr (7.5) 
r+é r+é 
tila [dds pat fb das) | 
For computational purposes, this reduces to 
flu, , U2 ,°** » Ue 37) = max | -asce. » V2, °** , Wy) 
: (7.6) 


er ee | 


with f(u, , U2, --+,u;7) = 0. Herer = né,---,T = Mé. 

8. Terminal control—Linear criterion. Let us now consider the case where H is a 
linear function, which is to say, we consider the problem of maximizing the inner product 
[s(7’), a] where a is a given vector. To simplify the notation, let us consider only the 
case where A is a constant matrix. 
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Using the representation for x(t) given in (4.2), we see that 
oe 
[x(T), a] = (exp (AT )e, a) + | exp [A(T — s)]y(s) ds, a. (8.1) 


Neglecting the term [exp (AT7')c, a], which is independent of y, we have the problem 


maximizing 


»T 
J(y) = | exp [A(T — s)]y(s) ds, a| (8.2) 


over all 1; satisfying the constraints 


(a) msySm, OSt<sT, t=1,2,:--,N, 
; (8.3) 
Introduce the function 
(8.4) 


= max J(y), 


y 


f(k, T 


where the maximum is over all functions y satisfying (8.3a) and (8.3b), and & is as in 


(8.3b) 


It is easy reasoning as above to see that f(k, 7’) satisfies the equation 


f(k, T) = max ( [ exp [A(T — s)]ly ds, a) 
(8.5 


+ | - | GUY: » Yo, °** » Yn) a8, T — i]. f(k, 0) = 0. 


For computational purposes, we can use the relation 


f(k, T) = max [é(exp (AT)v, a) + f(k — 6G(v, ,v. , +--+ , vy), T — 8)). (8.6) 


We see then that in the case where the underlying equation is linear, and there is 


only one constraint of the form in (8.3b), we can compute the solution using sequences 
of functions of one variable. 
If there are two constraints of the type 


we introduce a Lagrang 


/ a| — X | Gy y y dt. (S.8) 

For each value of \ we have a one-dimensional problem. As the parameter } is varied, 
we range over a set of values of the constraint [4 G,(y; , y2, «+: , yw) ds. 

In some cases, if the constraint is that given in (8.3b), we can solve the problem 


analytically, see [9, 10], and thus eliminate all computational aspects. It can easily 
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happen that a direct numerical solution for a range of values of k and c consumes less 
time and effort than a solution based upon an explicit analytic expression. 

9. Time-lag processes. As the preceding sections show, the success of the method 
presented in the foregoing sections rested upon the superposition principle. Given an 


equation of the form 


L(x) = y, (9.1) 


Ling = C, 
we were able to write the solution in the form 
x= X(te + Tly), (9.2) 


thus avoiding any interaction between the initial state and the forcing function. 
Also important was the fact that 7'(y) had the form 


t 
T(y) = [ K(t, s)y(s) ds. (9.3) 
Jo 

It follows from this analysis, that the methods of the preceding sections are applicable 
to situations in which the underlying equations are differential-difference equations of 
the form given in (1.9). For the case where the coefficients are constant, Laplace trans- 
form methods yield the requisite representation formulas, see [7]. or the case of variable 
coefficients, these representation formulas have been developed in a forthcoming paper 
by the author and K. Cooke. 

10. Heat conduction processes. In the study of the control of thermal processes, 
and in connection with the recent field of nuclear reactor control, we encounter variational 
problems in which the underlying equation is 


u, — U,, = f(z, b), 0<2z<l, t> 0, 
u(x, 0) = e(zx), 0<2 <1, (10.1) 
“uO, = ul, 2 = 0, [> & 


Since the solution of this equation possesses the requisite properties described in 
the preceding section, it follows that a number of variational problems involving this 
equation can be treated by means of the foregoing techniques. A detailed discussion will 
appear subsequently. 

11. Successive approximations—I. Let us now briefly, without entering into any 
rigorous discussion, which as may be imagined is non-trivial, indicate how the method 
of successive approximations may be combined with the foregoing techniques so as to 
reduce the computational solution of general variational problems involving non-linear 
differential equations to sequences of computational problems involving functions of 
fewer than N variables. 

An important point to.emphasize is that modern digital computers enable us to use 
as single steps in a computatioual process the solutions of problems once considered 
formidable in their own right. 

Consider the problem of maximizing 


J(y) = H[x(T), xT), «++ , 7.(T)] (11.1) 
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over all y;(¢) subject to 


(a) 4 = G,(z, y), x(0) =¢; , $= 1,2,--- ,N, 
(b) m <y; < mi, +=1,2,--- WN, (11.2) 


(c) | Gy. ,Y2,°'' » yn dt<k. 


/0 


Let y(t) = [y?(d), y2(), --- , yx(] be an initial guess in policy space, satisfying 
(11.2b) and (11.2c), and let x°(t) = [x°(t), x8(t), --- , x2(t)] be the set of z-values deter- 


mined by (11.2a) when y(¢) is replaced by y"(t). 
Consider the new system of differential equations 


dz; : 0G 
—— a G(x’. y) A. (x, — r°) i (7°. )°) 
dt : wih > ” Ox; d 


N “fy 
+ > (y; -— y!) ad (x", y) ¢=1,2,--+ ,N. (11.3) 
i=l OY; 

The new variational problem is that of maximizing J(y) over all y satisfying (11.2b), 
(11.2c), and related to x by means of (11.3). 

Since the underlying ‘system in (11.3) is linear, this variational problem may be 
treated in terms of functions of k variables. Let the maximizing y(t) be called y’(é). 
Proceeding as above, we determine 2'(¢) using (11.2a), and consider the new approxi- 
mating linear equation 
dG; 
OY; 


dz, 
dt 


OG; oe 


N 
ae (hs y) + XS (y; - y}) (x', y'). (11.4) 
Or; j=l 


N 
= G,(z', y') + > (x; — 2;) 
7 1 
The new variational problem determines a vector y~ which enables us to determine 
a new state vector x by way of (11.2a). Continuing in this way, we obtain a sequence 
of vector functions {y"}, and a sequence of state vectors {x*}, which we hope converges 
to a solution to the original variational problem. 
12. Successive approximations—II. If the criterion function H[z,(T), x,(T 
z,(T)} is linear, 
Ha; 5 fe 5 *** 5 Ze) = Ce, D, (12.1) 


then the approximation procedure outlined above yields a problem which at each step 
can be resolved computationally in terms of sequences of functions of one variable, and 
even in explicit analytic terms under favorable circumstances. 

13. Terminal control vs. general control. We have placed considerable emphasis 
upon terminal control processes because of the fact that a simple transformation enables 
us to treat general control processes as terminal control processes. If 


aT 
J(y) | F(z, , Ze °** 5» En 301 192 °°* 1 Yu) a, (13.1) 
the introduction of a new variable, zy,,(t), determined by the relation 
dx. 1 ‘ ‘ 
ar i F(z, ,2%s,°** » nw 391 52s °** 9 Um> Zy.:(0) = 0, (13.2) 


yields a new problem in which we wish to maximize xy,.,(T). 
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Similarly, if we have the criterion function G[z,(T), 2.(T), --- , 2y(T)], a new variable, 


.,(t), determined by the relation 


dine, _ i OG dx; _ ~d OG — 
dt a p> Ox, dt = > ox, H (2, y), (13.3) 


once again reduces the variational problem to one of terminal control. 
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BOOK REVIEWS 


(Continued from p. 258) 


It is addressed to physicists, experimental as well as theoretical, who are familiar with the elements of 
quantum mechanics and who are approaching the subject of angular momentum for the first time. 

Accordingly, it is not surprising that the author, in his presentation, avoids group-theoretical and 
abstract algebraic methods. Instead, he relies upon an inductive approach to his subject which leans 
rather heavily on the assumption that the reader, from previous experience, already has an intuitive 
idea as to what is meant by “angular momentum” and “‘spin.’”’ There is an inconsistency here in this 
assumption which may not be out of place in a spoken lecture but which some, like the reviewer, may 
find annoying to see in a more formal treatment of the subject. 

The book is divided into two parts of approximately equal length, entitled General Theory and 
Applications. Under General Theory, one finds a terse review of some basic principles of quantum 
mechanics (which the reviewer feels could well have been omitted), a discussion of the angular-mo- 
mentum operators (these are defined in terms of the transformation properties of wave functions under 
rotations), together with chapters dealing with the addition of two angular momenta and the Clebsch- 
Gordan coefficients, the transformation properties of the angular-momentum wave functions under 
rotations, irreducible tensors and the Wigner-Eckart theorem, and, finally, the addition of three angular 
momenta and the Racah coefficients. Applications include chapters on the expansion of the electro- 
magnetic field into multipole fields, the multipole moments of static charge distributions, spin one-half 
particles, oriented nuclei and angular correlations, angular distributions in nuclear reactions, and wave 
functions for systems of identical particles. 

The discussion of the general theory is sufficiently detailed and up to date so as to give the reader 
a useful first approach to the theory of angular momentum. The applications are sufficiently varied so 
as to illustrate the considerable scope of the subject; it is inevitable, however, that the author cannot go 
into as much detail here as in the discussion of the general theory. It is for this reason that the second 
half of the book does not hang together nearly so well as the first part; it can, however, serve as a very 
useful point of departure to the periodical literature. 

Since it seems fairly certain that this book will be widely used as a first introduction to the subject, 
one would have hoped that certain inconsistencies and ambiguities, which may trouble the beginner, 
had been avoided. For example, on p. 16, there is the remark that, in quantum mechanics, “‘no more 
than one component operator (of the angular momentum) can be a constant of the motion.’ On p. 81, 
we have the statement that “‘a scalar, as the term is used here, may be a tensor of any rank.’’ Finally, 
the author consistently uses the term “projection quantum number’ in place of the old-fashioned 
“magnetic quantum number”’ except for the several instances where he forgets himself and reverts to 
the more conventional nomenclature. 

Davip FELDMAN 


A pplied group-theoretic and matrix methods. By Bryan Higman. Oxford University 
Press, New York (American Branch), 1955. xii + 454 pp. $9.60. 


This book is an outgrowth of lectures given at the University College of the Gold Coast, the object 
being to present the theory of matrices and group representations in a form palatable to chemists and 
physicists. The style of the book, which is quite unique, reveals its origin in a rather informal lecture 
series and the author probably succeeds in conveying the intuitive picture that goes along with the 
formal theory. The author, however, shows excessive disregard for mathematical rigor to the point of 
making rather vague statements of theorems. Most of the book is devoted to the applications in almost 
all the obvious branches of physics and chemistry. At places these are discussed in monotonous detail 
and one is left with the impression that the author, in amplifying his lecture notes, did not know where 


to stop. 
G. F,. NEWELL 


(Continued on p. 318) 
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—NOTES — 


EXTENSION OF MICHELL’S THEOREM TO PROBLEMS OF 
PLASTICITY AND CREEP* 


By BERNARD BUDIANSKY (Harvard University) 


A well known theorem of linear, isotropic elasticity due to J. H. Michell [1] gives 
the conditions under which the generalized plane stress distribution in a multiply- 
connected sheet subjected to prescribed boundary stresses is independent of Poisson’s 
ratio. Such independence exists if, and only if, the resultant force (not necessarily the 
couple) on each boundary vanishes. It is shown in this paper that the same independence 
of the elastic value of Poisson’s ratio holds under the same conditions even when plastic 
flow and creep occur’. 

Let oag(®, yi 1), €as(@, yj; t), and u,(x, y; t) be the time dependent stress, strain, and 
single-valued displacement distributions’ that constitute a solution to the following 
time-dependent boundary value problem for the region R bounded externally by the 


curve C, and internally by the curves C; (¢ = 1, 2, --- N). In R: 
CaBp.a = 0, (1) 
] 
fas = (1 + vous — VOyy5a8] + fas » (2) 
€ap = $(Ua.p + Us.a); (3) 
Un CU, : 
Tape = TT 5° («= 0,1,2,--- MN). (4) 


In this formulation / is Young’s modulus, v is Poisson’s ratio, n‘"’ is the unit outward 
normal to the boundary C, , and the 7’;"’ are prescribed, time-dependent distributions 
of boundary traction”. The functions f,3 (= fs.) represent the plasticity and creep 
contributions to the strain, and are permitted to depend on time and the detailed history 
of stress. The stress-strain relation (2) therefore represents the most general (isothermal) 
plasticity and creep law for elastically isotropic materials. 

Next, consider a different material obeying the same stress-strain law (2) with the 
exception that Poisson's ratio is replaced by v* ¥ v; the value of £, and the functional 
dependence of f,, on time and stress history are assumed to remain unchanged. To 
determine whether or not o,3 remains a solution of the same boundary value problem 
for the new material, it is only necessary to see whether the strains 


] - 
ee, = KB (1 + v*)oas — v*o, 508] + fas (5) 
are derivable by means of the strain-displacement equation (3) from some single-valued 


displacement u% 








*Received August 12, 1957. 

'In the case of elasticity, independence of Poisson’s ratio is equivalent (by dimensional analysis) 
to independence of all elastic constants; this is no longer true in the present case of plasticity and creep. 

2The usual summation convention and subscript notation for tensors and vectors are used here, 
with Greek subscripts taking on the values 1, 2. 

83Whether Michell’s conditions are fulfilled or not, the T° must, of course, satisfy conditions of 


equilibrium of the body as a whole. 
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Now let 
a8 = Cap — €x8 - (6) 
Then, from (2) and (5), 


* 
eS / 
tap = E om (Gap = Fey Sua (7) 


E 
Jj = uae // €.8Tag GA, (8) 
R 


where 7. is any distribution of stress satisfying equilibrium: 


Tapia = 0 (9) 





Next consider the integral 


and producing zero boundary tractions: 
Tats =O on C, (¢ = 0,1,2,--- N). (10) 


Now, if ¢,, is derivable from a single-valued displacement, it follows from the principle 
of virtual work that the integral J given by (8) vanishes. But a converse theorem is 
also valid; if J = 0 for all r,s satisfying (9) and (10), then ¢’,, is derivable from a single- 
valued displacement [2, 3]*. Substituting (7 


7) into (8) gives 
J = I (asTa8 — Taates) AA. (11) 
R 


The stresses 7,, can be related to a stress function ¢ by 
Tap = 9,775ap — 9,08 (12) 


and then (11) becomes 


Su - | ors (13) 
R 


As a consequence of the boundary conditions (10) on 7,3 , it follows that ¢., is single 
t) 


valued, and, furthermore, has a constant value, say K‘{"’, on each boundary (see, for 
example, [5], p. 191). It is therefore possible to transform (13) as follows: 


re 


J == ]/ ((c0s6.0).8 — (08.80.01 4A, 
"R 
N a 
- > K; ) Tapng AS, 


NV . 
= - DL K2P? dS, where P,? = $7: ds. 
+=0 
Cs 


‘The proof in [2] is specifically limited to simply-connected regions; the proof in [3], and the support- 
ing theorems contained in [4], are valid for multiply-connected regions, 
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Hence, if PS’ = 0 on each boundary, J vanishes for all admissible choices of rg . Hence, 
by the converse theorem mentioned above, ¢/,, is derivable from a single valued dis- 
placement. The same is then necessarily true of e*; = €as + €43, and hence o,g remains 
a solution for the stress when Poisson’s ratio is changed. On the other hand, if P‘ does 
not vanish on some boundaries, a suitable choice of r,, can always be made to render 
J non-zero. But this would necessarily imply that the strains ¢/,, (and hence e*,) are 
not derivable from a single valued displacement, whence o,,3 would certainly not consti- 
tute a solution for the new material. 

The present theorem can be useful in simplifying the initial formulation of some 
problems. For example, the choice vy = } in conjunction with a total stress-strain law of 
plasticity permits the use of a single formula for the sum of the elastic and plastic com- 
ponents of strain; in other problems, the choice »y = 0 might be more appropriate. In 
addition, the present theorem may conceivably have significance in connection with 
photoplasticity. 

Acknowledgment. The financial support of the Office of Naval Research under 
Contract Nonr 1866(02) is gratefully acknowledged. 
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ON ISOPERIMETRIC INEQUALITIES IN PLASTICITY* 
By WALTER SCHUMANN (Brown University) 


Abstract. The purpose of this paper is the proof of the inequality P > 61M, , where 
P is the total limit load, 1/, the yield moment of a thin, perfectly plastic, simply sup- 
ported, uniformly loaded plate of arbitrary shape and connection. 

Introduction. The theory of thin, rigid-perfectly plastic plates, given by Hopkins 
and Prager [1]** has been applied to circular plates with various load and edge con- 
ditions. However, if one tries to extend this theory to non-symmetrical cases, serious 
difficulties arise in seeking examples of exact solutions, although some cases have been 
solved (see for instance [2]). As a contribution to the estimation of the limit load in an 
arbitrary plate we shall use here the isoperimetric inequality, which relates a circular 
domain to an arbitrary domain in a convenient manner. One of the principal theorems 
of limit analysis [3] and the methods for isoperimetric problems given in Polya’s and 
Szeg6’s book [4] will be used. Similar problems have been proposed and solved for other 
physical quantities, as for example the torsional rigidity, the principal frequency, etc. 


*Received August 16, 1957. The results presented in this paper were obtained in the course of re- 
search conducted under Contract Nonr 562(10) by the Office of Naval Research and Brown University. 
**Numbers in square brackets refer to the bibliography at the end of the paper. 
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2. A lower bound for the limit load of a simply supported plate of arbitrary shape 


under uniform load. Let us consider a simply supported plate of a rigid-perfectly 
plastic material obeying Tresca’s yield condition (Fig. 1), and let p be the limit load 


v.Mises yield condition 8 es 
aq 









Tresca's yield condition 














per unit area, which is assumed to be constant, A the area of the domain G (Tig. 2), 
P = pA the total limit load and M, the yield moment. Then the following inequality 
is true: 
P > 6rM, , (1) 
where the equality sign holds only in the case of the circular plate. 
To prove this statement, we consider, in addition to the domain G, a circular domain 
G’ of equal area A (Fig. 2), and we map the actual velocity field v of G into a new field 
v’ over G’ in a certain way, that will be defined later. All further quantities for the new 


simply supported 





Fig. 2. Schwarz’ symmetrization: A, = A;’. 


field will be distinguished by primes from the corresponding quantities of the original 
field. Denoting by D the dissipation function per unit area, we have 


I pdA = I/ DdA. (2) 


G 


1958} WALTER SCHUMANN 311 


Using the second theorem of limit analysis [3] and assuming that v’ is kinematically 
admissible for G’, we have 
I| pv’ dA’ < I/ D’ dA’. (3) 
iJ iJ 
Suppose for the moment v > 0, and let D,,, be the total rate of dissipation, V the volume 
between the plane of the plate and the surface into which the plate deforms; then (2) 


and (3) give 


Di. Di. 
P = "4 A, = Vr A. (4) 
To get the inequality (1), we look whether there exists a mapping v — v’ such that 
D Di 
— > =F (5 
,- 2 5) 
For this purpose let us first investigate the dissipation function D, which is (see [6], 
p. 50) 
M, | | | | | 1 » 
D = 9 (| Ky “he Ko | a Ki + Ke l), (6) 


where «x, and x, are the principal rates of curvature associated with the velocity field 
v. For domains of (positive) elliptic and parabolic curvature corresponding to the regimes 
A, AB and AF of the yield hexagon we have 


D = M(x, + «:) = 2M.H = —M, V’ 2, (7) 


where H is the rate of the mean curvature, and v is counted positive when directed 
downwards. On the other hand we may write for all regimes 


ix: |, D> 2MH. (8) 


D > M, max 


We shall later use the fact that (8) is valid at every point of the field v, even if hinges 
occur. A hinge line may be considered as a narrow strip, where one of the rates of curva- 
ture is very large and the other finite. 

Finally, we note from (8) and Green’s formula, that v > 0 everywhere, since a 


domain with » < 0 can be removed by v* = 0, thus diminishing D,,,/ff vdA, so that 
v < 0 cannot be the actual field. 
Denote now by C, the contour line v = p of the surface v = v(x, y) (Fig. 2). We note 


that C, may consist of several branches. Let «, be the rate of curvature tangential to 
the contour line C, , « the curvature of the contour line in its plane, dg = « ds the incre- 
ment of the angle of the tangent at C, , when a point of v moves an increment ds on C, , 
and finally let 8/dn denote differentiation normal to C, into its “‘interior’’, i.e. the direc- 
tion of increasing p. We integrate the dissipation function D over an infinitestimal 
strip between C, and C,,2, , Which gives, by using (8), 


> 


dD... > Mo D max | x; | dnds > M, $ | x, | dnds 
° z ) 


2r 
2 $ at = dn ds > M, | dp dp = eM, dp. 


Cp 
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The equality sign in the third inequality of (9) holds, when C, is convex and consists 
only of one branch. 

After these preparations we now specify the mapping of v in two steps as follows: 
v(G@) — v’’(G’), v”’(G’) — v'(G’). The first transformation is the so-called Schwarz’ sym- 
metrization (see [4], p. 190 or [5]). Let C?’ be the contour line of v’’, which corresponds 
to C, (same height p), and let A, and A?’ be the areas, which C, and C?’ “surround”’ 
(increasing p). Schwarz’ symmetrization is then defined as follows 

I. C?’ is a circle, concentric in G’. 

II. The areas A, and A?’ are equal. 

It is easy to see, that Schwarz’ symmetrization does not change the volume V. However 
D.., might be, at least in certain cases, increased. We introduce therefore a further 
transformation. The velocity field v’’ consists, because of the rotational symmetry, of 
several ring-shaped circular zones of elliptic, parabolic and hyperbolic rate of curvature 
(Fig. 3). We replace the body between v”’ and the plane of G’ by its convex hull (surface 
v’ indicated by the dotted lines in Fig. 3), which has only elliptic and parabolic curvature. 





| simply supported 
P=0 | i 
‘ : 7 
‘ Hey’ 








axis of symmetry =| 
Fia. 3. 
Let C;. be the smallest contour circle, such that outside C/. no elliptic curvature exists. 


The transformation has then the following properties 
a) The volume is increased: V’ > V”. 


b) Ci ison v’ andon v”; 





Aj. = Al=A,. 
ov’ ov’! 
oS ===), tor = p*; 
) an oe eee 4 
dA, dA?’ dA, 
a aw. fo = p*. 
dp dp dp’ aig p 
As outside C/. , D’ = Mox! , we shall have the equality signs in (9) there. As on the 


other hand the inequalities (9) are valid everywhere in G, we obtain 
Puan oe Bene OH OK PK. (10) 


“Inside” C,. , the second inequality (8) can be applied, which gives 


in a I Vdd = M, $ = as, (11) 
on 
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and inside C’. we may apply Eq. (7) (regimes A, AB) 


Dictse>oe = —Mo I V’v' dA’ = M, A ds’. (12) 
on 
p2p* Cc 


‘pe 


As dv/dn > 0, dv'/dn’ > O, per definition of the contour lines, there exists an important 
inequality between the last terms in (11) and (12), which is actually the key point of 
the proof. It follows namely from Schwarz’ inequality for dv/dn and (dv/dn)~* and from 
the isoperimetric inequality 4rA,. < U. , where l,. is the length of C,. (see [4], p. 234) 


ad -.. r , 
) “ ds > sa $ x ds’. 
J an dA, | an (13) 
Cpe C’ pe 
Ap | p=ps 
Therefore we obtain 
Dos DO, ar oa v. (14) 


From (14) and (10) we conclude, that D,,,/V, and also P, are not increased by the 
mapping G — G’, and as 67M, is actually the limit load of the circular plate (see [6], p. 
55), inequality (1) is proved. 

It remains to show that the equality sign in (1) is valid only in the case of the circular 
plate. The equality sign in (13) holds only when C,. is a circle, and when dv/dn is 
constant. The equality signs in the two last inequalities of (9) hold, when every contour 
line outside C,. consists of one convex branch and is a line of principal curvature, which 
means that d°v/dnds = 0. As one of them, namely C,. , is circular, they must all be 
circular; therefore, the edge is a circle. 

3. v. Mises’ yield condition. If one takes v. Mises’ yield criterion instead of Tresca’s 
condition, the limit load is not diminished, because the ellipse surrounds the hexagon 
[7] (Fig. 1). Therefore the inequality (1) remains true 


P (v. Mises) > 67M, . (15) 


However the inequality is not isoperimetric. 

4. Minimum weight design of a sandwich plate. As there is a certain duality be- 
tween analysis and design problems [8], we expect also an isoperimetric inequality in 
the latter case. However the result is less useful, because a bound for the minimum 
volume, for example, does not help in finding the actual design. Nevertheless let us 
look at a sandwich plate of variable thickness h of the sheets, but constant thickness 
H,, of the core, with a homogeneous material obeying Tresca’s yield criterion. The yield 
moment is given by 

M, = ohH, , (16) 
where a is the vield stress. Looking for a statically admissible stress field with regime 


A of the hexagon (Vig. 1), Prager [9] has shown that h must satisfy the equation 


Vh= ——, (17) 


and he mentioned the analogy between this problem and that of the membrane. The 
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volume of the sheets 


¥ siatveatty admissible — 2 I h dA (18) 


ve 


G 
corresponds to the torsional rigidity, if we take the torsion analogy instead of the 
membrane analogy. For the torsional rigidity the isoperimetric inequality has been 
proved long ago; thus we may write, by using the first theorem of limit analysis, 


< pA 


statically admissible — J circle — how H. 
‘ 0 0 


Y < (19) 

5. Steiner’s symmetrization. In the case of a very long but narrow domain G, the 
isoperimetric inequality gives a very bad bound. Steiner’s symmetrization (lig. 4, see 
also [10]) does not change G so much as Schwarz’ symmetrization, if the axis is chosen 
conveniently. Steiner’s symmetrization, which increases the torsional rigidity, therefore 





- oxis of symmetry 
Fig. 4. Steiner’s symmetrization. 


increases also the volume of a design given in (18), and probably diminishes (or leaves 
constant) the limit load of a simply supported, uniformly loaded plate. Unfortunately, 
we are not able to prove the last of these two statements, which would be useful. 


BIBLIOGRAPHY 


H. G. Hopkins and W . Prager, The load carry ing capacities of circular plates, a Mech. Phy Ss. Solids 
2, 1-13 (1953 

2. R. M. Haythornthwaite and R. T. Shield, A note on the deformable region in a rigid plastic struc- 
ture, Brown University Tech. Rep. C11-—26 (1957 

3. D. C. Drucker, W. Prager and H. J. Greenberg, Extended limit design theorems of continuous 
media, Quart. Appl. Math. 9, 381-389 (1952 

4. G. Polya and G. Szegé, Isoperimetric inequalities in mathematical physics, Princeton (1951 

5. H. A. Schwarz, Gesammelte Math. Abhandlungen, vol. 2, Géttingen, 327-335, 1890 

6. W. Prager, Probleme der Plastizitatstheorie, Basel, 1955 

7. R. Hill, A note on estimating yield-point loads in a plastic-rigid body, Phil. Mag. 43, 353-355 (1952) 

8. D.C. Drucker and R. T. Shield, Design for minimum weight, Proc. 9th Intern. Congr. Appl. Mech., 
Brussel, 1956 

9. W. Prager, Minimum weight design of plates, De Ingenieur 48 (1955) 


10. J. Steiner, Einfache Beweise der isoperimetrischen Hauptsitze, Werke II, Berlin, 75-91, 1882 


ae 





AFR RATIO AR en ager 





1958] R. M. DURSTINE AND D. H. SHAFFER 315 


DETERMINATION OF UPPER AND LOWER BOUNDS FOR SOLUTIONS TO 
LINEAR DIFFERENTIAL EQUATIONS* 


By R. M. DURSTINE anv D. H. SHAFFER (Westinghouse Research Laboratories, Pittsburgh, Penn.) 


Analytic upper and lower bounds may be constructed on the solutions to linear 
differential systems of a certain kind, as described in this paper. The principal require- 
ment of the method presented here is that two functions satisfying the boundary con- 
ditions placed on the system—and certain other conditions—must first be constructed. 
The bounds are then generated as linear combinations of these two functions. The 
development proceeds as follows. 

Consider a linear differential equation 

Lu) +¢=0 
defined in a suitable region D with boundary B, and subject to the linear conditions 
M(u) = d, on B&B, ] tt ae q; 
where L and J/, are homogeneous linear differential operators, and q is appropriate to 
the particular problem. Here ¢ is taken as a known piecewise continuous function in D. 
The functions \, are known on B. 

Now select a function w which has the following properties: 

|) w is piecewise continuous in the highest derivatives appearing in L; 

2) M;(w A; on B, 7 i aoe | 


Then define € as 


If 2 u — w, then 
Lia —e in D 
M,(v) = 0 on B; 7 =1,2,-°- ,q. 
It is assumed that the original system was such that the above problem has a unique 
solution, and furthermore possesses a Green’s function which does not change sign at 
any point of D. We have then 


. 
v | G(x, E)e(E) dé, 
D 


where x and £ represent points of the space in which we are working, regardless of dimen- 
S1on 

Let w, and w, be two functions, each satisfying the conditions previously imposed 
on w. Corresponding to these are two functions ¢,; and e, . It is required in addition that 

1) ee, be continuous in D: 

2) «, not change sign in D; 

3) one of the following inequalities holds 


1> M > .e/e, > m 


or 


M > «/e > m> 1. 


*Received June 26, 1957. 
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We have then 


(a. £)e, ¢ 
u — Ww, - [ Gla, SealE) de 


u— U; 


| G(x, Be(€) dé 
D 


Application of a mean value theorem to the right hand side of this expression gives 


Eicaeatae 
: | e,[€:(x)] ’ 
G(a, £)e(€) dé 

7D 


where é, is an interior point of D, and clearly a function of x. For brevity we denote 
e[t,(x)] by 6(x). Thus we have 
“ — Ws 
u— W, 


Rearrangement of this formula gives 


u=w, + (w, — ws) / (® _ 1). 
0; 


If the exact forms of 6, and 6, were known, this would give the solution to the original 
problem. However, even though these forms are not known, the ratio is, by hypothesis, 


mS 


bounded. 

Since ¢,/¢, is bounded away from unity in D it follows that w, — w, is always of 
one sign in D, as will be determined by the sign of G. So the second term in the above 
expression for u may be bounded in D by replacing the denominator by its minimum 
and maximum values. Define u, and u, by 

u, = WwW, + (vw, mre w.)/(M — 1) 


Us = UW; + (WwW, — W2)/(m — 1). 


The true value of u must lie between wu, and uz . 
The “efficiency” of these bounds may be defined as the spread between them, given by 


ea 4 : , { M-—=m : | 
en we (m — 1)(M —1)] 


This formula shows the curious fact that very good bounds may be obtained from func- 
tions w, and w, which are themselves poor approximations to u. 
As one example of this method consider the Bessel equation 
u’ +u’/e + 4u=0 
inO < x < 1, subject to u(0) = 1. We use the 0 < x < 1 range to satisfy the conditions 
imposed on the Green’s function. The solution to this problem is u = J,(2zr). As approxi- 
mating functions, we take 
w, = 1 


Ww, = 1 — 1.11672* + .35702’. 
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Associated with these are 


«= 4 
€ = —.467 + 3.2132 — 4.4672? + 1.428’. 
From these we compute 
m = —.11675 
M = .05125 


and hence 

1.054w, — .0540 < J,(2xz) < .8955w. + .1045 
which produces a rather accurate pair of bounds, particularly near the origin. The func- 
tion w, was of course itself a good approximation to J,(2z), chosen by another approxi- 


mation scheme. 
As another example consider 


Ay = -2 
in the region interior to x*/a’ + y’/b’ = 1 subject to y = 0 on the boundary. Choose 
w, = 1 — 2*/a’ — y’/d? 
w, = 0. 


The conditions requisite for the application of this method are fulfilled and we have 
e& = 2 _ 2/a? _ 2/b’ 
c& = 2. 
Since both errors are constant, it follows that the precise form of 6,/6, is known. Hence 
instead of bounds we obtain the exact solution 
vy = (a’b’ — b’x”? — a’y’)/(a’ + 0’). 
This is the classical problem of the torsion of an elliptic cylinder. 
It should be noted in passing that the results presented here are obtainable under 
the weaker assumption of only piecewise continuity for e./e, . 
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(Continued from p. 306) 


The hypercircle in mathematical physics. By J. L. Synge. Cambridge University Press, 
Cambridge, 1957. xii + 424 pp. $13.50. 


In 1947, the author and W. Prager collaborated on a paper entitled, Approximations in elasticity 
based on the conce pt of function space, in which the ‘“‘method of the hypercircle’’ was developed to furnish 
approximate solutions to boundary value problems of elasticity theory. At the time of this research, 
the reviewer was privileged to read the correspondence between the authors. It is especially gratifying, 
therefore, to read the present book which represents a summation of over a decade of work and thought 
during which the ideas developed and new applications materialized. 

The hi percire le method is one of functional approximation in Hilbert space, Given a boundary value 
problem, 1 Tur ion space is defined with an appropriate metric. The solution of the proble m is charac- 
terized as the intersection of two orthogonal linear subspaces. This is typified in the Dirichlet problem 
where one of the subsp ices consists of all vector fields obtained as gradients of functions satisfying the 


5 


boundary conditions, the other subspace consists of all divergence free vector fields. The metric is the 
Dirichlet integral. By choosing finite sets of particular vectors one constructs a finite dimensional sub- 


space in each of the given subspaces. Minimizing the metric distance between the subspaces determines 


, 


a pair of ve ‘vertices.’’ The solution of the problem lies on a hypercircle having the segment 
joining the vertices in the space as diameter. This gives upper and lower bounds for the norm of the 
solution. Knowing the center and radius of this hy percircle, one takes the function associated with the 
center as an approximation to the unknown solution with error measured in the sense of the metrie by 
the radius. Increasing the dimension of the subspaces reduces the error and improves the approximation 


In certain problems, such as determining the torsional rigidity of a prismatic bar the norm of the 
solution function is itself of inte rest. This can b 


e accurately determined with precise error estimates. The 
function associated with the center of the hy percircle can be evaluated to yield pointwise approximation 
in the physical domain to the solution, e.g. in the torsion problem to yield approximate stress and warp- 
ing values. Although the pointwise accuracy ol the approximation at any stage is unknown, convergence 
in the sense of the metric implies pointwise convergence. However, the method can be extended to de- 
termine point bounds on the solution and on its derivatives. This is done by bounding the projection 
of the solution vector on a sper ially chosen vector which plays the role of a Green’s function, although 
no singularity appears 

Turning to the practical computational problem the author has devised a systematic way of con- 
structing subspaces of increasing dimension so as to contain functions which approximate the solution 
and its first derivatives to any desired accuracy. These functions are termed “pyramid functions.” 
For a boundary value problem in a domain in two dimensions, such a pyramid function corresponds to a 
polyhe dro ynstructed over a tri iungul irization of the plane domain with the vertices of the polyhedron 
located above the vertices in the domain 

The book is divided into three parts. In Part I, elementary properties of linear function space are 
explored which do not depend on metric 

Part IT is devoted to the consequences of the introduction of a positive definite metric and occupies 
the bulk of the book. In this part the hypercircle method is developed, together with the computational 


method of the imid functions. The results are pplied to various boundary value probl ms of mathe- 





matical phys heginning th the Dirichlet and associated problems in potential theor Particu- 
larly detailed calculations are carried out for the problem of determining the torsional rigidity and stress 


and warping of regular hexagonal and hollow square sections. Excellent bounds are obtained. As ex- 


amples of mixed boundat \ ilue problems the author carries out calculations for the flow o viscous 
fluid through a semi-ci1 iv channel and the deflection of an elastically supported triangular mem- 
brane under uniform pressure. Other boundary ilue problems for which the machinery of the hyper- 


circle method is set ip are those of equilibri im of a three dimensional elastic body and problems 


associated with the bihart ni equation in hydrodynamics and elasticity. 

Part III considers the implications of an indefinite metric in the function space. The hypercircle 
becomes a ‘‘pseudohypercircle’’ and due to the presence of non-vanishing null vectors in the space, 
reducing the 


lius of the hypercircle to zero will not insure convergence of the approximating functions 
to the solution. H« 


those cases where the metric is the difference of two positive-definite met- 
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trics, as in Minkowski space-time, projections of the hypercircle on the “space-like’’ and ‘“‘time-like” 
subspaces yield certain bounds. Minkowski type metrics arise in problems of vibration. The method of 
the hypercircle is discussed for the cases of forced elastic and electromagnetic vibrations. 

The book is written in the characteristically lucid and interesting style of the author. It is intended 
for the student as well as the specialist and for the engineer or physicist as well as the mathematician. 
The pace is leisurely and considerable effort and ingenuity are combined to make the geometry of func- 
tion space as familiar to the novice as ordinary Euclidian 3-space. There are a good number of interest- 
ing problems to be worked out by the reader. 

The power of the hypercircle method as seen by the author lies in its wedding of geometry to 
analysis enabling one to visualize and use intuition to suggest both valid theorems and their proofs. 
In his introduction, the author refers to other treatments of the problem of bounding solutions of 
boundary value problems which proceed without diagrams or geometrical ideas. These he recommends 
to readers who “prefer to take their analysis neat.’’ This reviewer, however, heartily recommends to 
all readers the stimulating mixture which Professor Synge has concocted for us. 

H. J. GREENBERG 


Nonparametric methods in statistics. By D. A. 8. Fraser. John Wiley & Sons, Inc., New 
York, and Chapman & Hall, Ltd., London, 1957. x + 299 pp. $8.50. 


The title of this book is somewhat misleading; the reader who wants a comprehensive collection of 
nonparametric statistical methods will not find it here. The author’s aim seems rather to have been to, 
give a theoretical treatment of those nonparametric techniques which admit some theoretical justifi- 
cation, and in the state of knowledge extant when the book was written this meant excluding mention 
of many important problems and many useful and commonly employed techniques. 

In the first two chapters (almost a half) of the book, the author gives a general introduction to 
statistical inference which for the most part follows lines of development similar to those employed by 
E. L. Lehmann in his mimeographed Berkeley notes. Many of the examples here are the standard 
parametric ones. Thus, the reader will find an introduction to statistical decision theory and such topics 
as sufficiency, unbiasedness, and invariance in estimation and testing hypotheses. However, the reader 
who believes the prefatory remark that the calculus and Hoel are the only prerequisites, may find the 
rapid measure-theoretic excursion of Chapter 1 overwhelming; and the professional mathematician 
may be a bit disturbed that the measure-theoretic care is dispensed with at the start of the next chapter 
and that most of the hard and interesting proofs are omitted while those given are usually trivial. The 
omissions can perhaps be justified on the grounds that these generalities are not the main content of 
the book 

The last five chapters deal with various nonparametric problems and properties of certain pro- 
cedures. especially tolerance regions, unbiased estimators, and rank tests. Some of the emphasized 
topics are treated rather completely, e.g., limit theorems concerning procedures of the last two types 
just mentioned, as well as the common conditional and run tests. As has been mentioned, many im- 
portant topics are unfortunately omitted entirely; among these may be mentioned the use of the sample 
quantiles and sample distribution function in estimation and testing hypotheses. The x? techniques 
receive only a brief historical mention. 

Problems at the end of chapters supplement and extend the text. Within the framework of the 
latter, they should be instructive to students. 

J. KiEFER 


An introduction to matrix tensor methods in theoretical and applied mathematics. By Sidney 
IF. Borg. J. W. Edwards, Publisher, Inc., Ann Arbor, Michigan, 1956. 202 pp. $4.75. 


The reviewer could find no excuse for the publication of this book, which contains many examples 
of erroneous thinking in both mathematics and physics. To give only one example, on page 61 the follow- 
ing statement can be found. “In general, the requirements that a function u(z,y) be continuous at a 
point (2 ,Yo ) are that all partial derivatives be finite and continuous at (20 ,yo ) and that the remainder 
term of the Taylor Series approach zero as the number of terms increases.”’ 

R. T. Surevp. 
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Economic models: an exposition. By E. F. Beach. John Wiley & Sons, Inc., New York, 


and Chapman & Hall, Ltd., London, 1957. xi + 227 pp. $7.50. 


For the non-genius, a knowledge of mathematical techniques is a sine-qua-non for work in economic 
theory; and it is highly desirable for the economic theorist to learn some pure mathematics. To be sure, 
& genius can get by on his high school plane geometry; but I venture the guess that the complexity of 
problems is growing at a faster rate than the mental capacity of human beings. As a result, it will soon 
be necessary for even a genius to learn mathematical techniques. Professor Beach realizes that ‘‘many 
students become interested in economic theory only after they have left the study of mathematics at 
an early stage some years previously. It is rather difficult for them to return to the study of mathe- 
matics when they are rather fully occupied with the requirements of advanced degrees.’’ When the 
student finishes reading this book, Professor Beach believes, he should know whether he wants to delve 
deeper into the literature of mathematical economics and statistics. 

An applied mathematician will not learn economics or mathematical economics from this book. 
If he is innocent of statistics, he could read part two with profit. The author provides a lucid and in- 
teresting introduction to sampling theory, simple and multivariate regression theory and the book con- 
cludes with a very useful chapter on the Cowles Commission’s Simultaneous Equations Approach. 
Koopmans’ famous article on Identification Problems in Economic Model Construction is discussed in 
this chapter, and it can be read with profit by many students. A bibliography is provided for those whose 
appetites for statistics have been whetted. 

The cardinal deficiency of this book is that it fails to concentrate on a few economic problems and 
show how mathematical techniques enable the economists to derive a more profound understanding of 
the economic processes at work. For example, Beach tells the student that Hicks and Modigliani enabled 
economists to understand Keynes’ contribution more fully, when they translated his model into mathe- 
matical terms. Beach then writes down their equations, but neglects to show the student how the 
mathematical model is more revealing than the literary model. Hicks translated Keynes into mathe- 
matics and then described his mathematical results in terms of two dimensional geometry; then, Hicks 
translated the geometry back into literary economics. Thereby, the mathematical reader, the semi- 
literate mathematical reader and the literary economist could benefit from Hicks’ mathematics. Beach, 
on the other hand, merely writes down Hicks’ equations and leaves it there. In the beginning of chapter 
three Beach interweaves algebra and geometry (familiar to economists) in a lucid and interesting 
manner. The student is lead back and forth between familiar geometry and unfamiliar algebra; and is 
thereby taught some algebra. Unfortunately, this method of exposition is not continued. 

As the book progresses the exposition deteriorates in quality. The student is given an introduction 
to the derivative and is taught that Dx” = nz*—!, Then twenty pages later the author begins to solve 
differential equations, using the term integration and techniques of integration; but he never taught 
the student anything about an integral or integral calculus. A student who was familiar with integral 
calculus would not need to study this book; and one unfamiliar with integrals would be mystified by 
his chapter on continuous dynamic models. 

Parts of this book are excellent introductions to the use of mathematical techniques in economics. 
Moreover, it is lucid and interesting. I have no doubt that Beach’s major purpose will be satisfied: the 
student will learn whether or not he wants to learn mathematical economics. 

J. L. STEIN 


Mathematical analysis—a modern approach to advanced calculus. By Tom M. Apostol. 
Addison-Wesley Publishing Co., Inc., Reading, Mass., 1957. xii + 553 pp. $8.50. 


The author properly describes his book by the statement in the preface, “. . . most of the topics 
which usually fall under the heading ‘Advanced Calculus’ are treated in this book. The author’s aim 
has been to provide a development of the subject matter which is honest, vigorous, up-to-date, and, at 
the same time not too pedantic.’’ The book is designed primarily as a text for (pure) mathematics 
students but is at a level of difficulty somewhat below the usual advanced analysis. The book is written 
in a clear leisurely style with ample foreshadowing and motivation of the developments even though 
there is practically no discussion of applications. 
G. NEWELL 
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HANDBOOK OF AUTOMATION, 
COMPUTATION, AND CONTROL 


Prepared by a staff of 106 specialists 


Edited by EUGENE M. GRABBE, SIMON RAMO, and DEAN E,. WOOLD 
RIDGE, all of the Ramo-Wooldridge Corp. Written and edited with systems engi 
neering emphasis in mind, the handbook covers material of direct use to all levels 
of technical personnel, in all areas of control. The major objective is to provide 
practical data for research, development, and design in the fields of feedback control, 
computers, data processing, control components, and control systems. 

Throughout, the stress is on new techniques and components for designing and 
building digital devices, making measurements, and developing control systems. 
Much of this material has not been available previously in handbook form. 

Each chapter starts with definitions and descriptions, then builds up to more 
complex details. Thus the handbook provides both an elementary frame of reference 


for the novice and advanced data tor the expert. 


Volume I—-CONTROL FUNDAMENTALS 


e Covers aspects of mathematics, as applied to control. 


e Includes a compilation of numerical analyses for digital computers—the latest 
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e Presents self-contained treatments of feedback control theory and of operations 
research. 


e Gives essential material on information theory and data transmission, 


1958. Over 1000 pages. $17.00. 
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Applied Mechanics Reviews at the request of the Office of Naval Research, have the 
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cover. The surveys cover the international literature, and pay special attention to 
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